VGI and crowdsourced data credibility analysis using spam email detection techniques

Koswatte, Saman, McDougall, Kevin, Liu, Xiaoye
International journal of digital earth 2018 v.11 no.5 pp. 520-532
Bayesian theory, data collection, disaster preparedness, e-mail, floods, models, professionals, spatial data
Volunteered geographic information (VGI) can be considered a subset of crowdsourced data (CSD) and its popularity has recently increased in a number of application areas. Disaster management is one of its key application areas in which the benefits of VGI and CSD are potentially very high. However, quality issues such as credibility, reliability and relevance are limiting many of the advantages of utilising CSD. Credibility issues arise as CSD come from a variety of heterogeneous sources including both professionals and untrained citizens. VGI and CSD are also highly unstructured and the quality and metadata are often undocumented. In the 2011 Australian floods, the general public and disaster management administrators used the Ushahidi Crowd-mapping platform to extensively communicate flood-related information including hazards, evacuations, emergency services, road closures and property damage. This study assessed the credibility of the Australian Broadcasting Corporation’s Ushahidi CrowdMap dataset using a Naïve Bayesian network approach based on models commonly used in spam email detection systems. The results of the study reveal that the spam email detection approach is potentially useful for CSD credibility detection with an accuracy of over 90% using a forced classification methodology.
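
The abstract names the technique but does not show it, so the following is a minimal sketch of a word-based Naïve Bayes classifier of the kind commonly used in spam filters, applied to short crowd reports. It is an illustration under stated assumptions, not the authors' implementation. The training messages, the labels "credible" and "not_credible", and all function names are invented for this example and are not taken from the Ushahidi CrowdMap dataset. Always assigning the higher-scoring class is one possible reading of "forced classification"; the study's exact procedure is not described in this abstract.

```python
import math
from collections import Counter

# Hypothetical toy training data, invented for illustration only.
TRAINING = [
    ("road closed by flood water near bridge", "credible"),
    ("evacuation centre open at showgrounds", "credible"),
    ("flood water rising property damage reported", "credible"),
    ("win free prize click now", "not_credible"),
    ("lol nothing happening here free stuff", "not_credible"),
    ("click here for free offer", "not_credible"),
]


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def train(labelled_messages):
    """Count words per class and messages per class."""
    word_counts = {}          # label -> Counter of word frequencies
    doc_counts = Counter()    # label -> number of training messages
    vocab = set()
    for text, label in labelled_messages:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocab.add(token)
    return word_counts, doc_counts, vocab


def log_scores(text, model):
    """Return log P(class) + sum of log P(word | class) for each class."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        counts = word_counts[label]
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)
        for token in tokenize(text):
            if token not in vocab:
                continue  # words never seen in training carry no evidence
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((counts[token] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return scores


def classify(text, model):
    """Assign the class with the highest score; no 'unsure' outcome."""
    scores = log_scores(text, model)
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("flood water over road", model))
print(classify("click for free prize", model))
```

Working in log space keeps the product of many small word probabilities from underflowing, and add-one smoothing keeps a single unseen word from zeroing out a class. A real credibility filter would need a much larger labelled sample and a held-out set before any accuracy figure could be reported.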