Development of a national-scale real-time Twitter data mining pipeline for social geodata on the potential impacts of flooding on communities

Barker, J.L.P., Macleod, C.J.A.
Environmental modelling & software 2019 v.115 pp. 213-227
artificial intelligence, floods, georeferencing, prototypes, risk, rivers, social networks, spatial data, stakeholders, Great Britain
Social media, particularly Twitter, is increasingly used to improve resilience during extreme weather events and emergency management situations, including floods, by communicating potential risks and their impacts and by informing agencies and responders. In this paper, we developed a prototype national-scale Twitter data mining pipeline to improve stakeholder situational awareness during flooding events across Great Britain by retrieving relevant social geodata grounded in environmental data sources (flood warnings and river levels). With potential users, we identified and addressed three research questions to develop this application, whose components constitute a modular architecture for real-time dashboards: first, polling national flood warning and river level Web data sources to obtain at-risk locations; second, retrieving geotagged tweets proximate to at-risk areas in real time; third, filtering flood-relevant tweets with natural language processing and machine learning libraries, using word embeddings of tweets. We demonstrated the national-scale social geodata pipeline using over 420,000 georeferenced tweets obtained between 20 and 29 June 2016.
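
The second step of the pipeline, keeping only geotagged tweets that lie close to at-risk locations, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary layout of a tweet, the 10 km radius, and the sample coordinates are all assumptions made for illustration, and great-circle (haversine) distance is used here simply as one plausible proximity measure.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def tweets_near_risk(tweets, at_risk_points, radius_km=10.0):
    """Keep tweets within radius_km of any at-risk point.

    `tweets` is a list of dicts with hypothetical keys "lat" and "lon";
    `at_risk_points` is a list of (lat, lon) tuples, e.g. derived from
    flood warnings or river level stations. The radius is an assumed value.
    """
    return [
        t for t in tweets
        if any(
            haversine_km(t["lat"], t["lon"], lat, lon) <= radius_km
            for lat, lon in at_risk_points
        )
    ]


# Made-up sample data: one at-risk point near York, two geotagged tweets.
at_risk = [(53.96, -1.08)]
tweets = [
    {"id": 1, "lat": 53.95, "lon": -1.09, "text": "River is very high tonight"},
    {"id": 2, "lat": 51.51, "lon": -0.13, "text": "Sunny afternoon in London"},
]
nearby = tweets_near_risk(tweets, at_risk)
print([t["id"] for t in nearby])
```

In a real-time setting the at-risk list would be refreshed by the polling step, and the tweets that pass this filter would then go on to the flood-relevance classifier described as the third step.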