Separation of geochemical anomalies from the sample data of unknown distribution population using Gaussian mixture model

Chen, Yongliang, Wu, Wei
Computers & geosciences 2019 v.125 pp. 9-18
computers, graphs, models, population distribution, probability, probability distribution, support vector machines, surveys, China
Separating geochemical anomalies from sample data drawn from a population of unknown distribution is a major challenge, because it is difficult to determine the correct model for that distribution. A Gaussian mixture model is a linear combination of several Gaussians. With a sufficient number of Gaussians and suitably adjusted parameters, the model can generate very complex probability densities and can approximate almost any continuous probability distribution. The Gaussian mixture model can therefore fit sample data from a population of unknown distribution, and data points that do not conform to the model are considered anomalies. The method was used to separate multivariate anomalies from 1:200,000-scale geochemical survey data collected in the Baishan district, Jilin Province, China, and was compared with a one-class support vector machine. The programs running the two models took 18.67 and 32.14 s, respectively; the receiver operating characteristic (ROC) curves of the two models intersect each other in ROC space; and the areas under the curves are 0.851 and 0.855, respectively. The “best” threshold determined by the Youden index was used to separate geochemical anomalies. The anomalies separated from the results of the two models occupy 14.46% and 14.49% of the study area and contain 83% and 70% of the known mineral deposits, respectively. The Gaussian mixture model is therefore comparable to the one-class support vector machine in geochemical anomaly detection, and it can be used as a geochemical anomaly detector with high performance and data modeling efficiency.
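
The workflow summarized in the abstract (fit a mixture to the data, score each sample by how poorly it conforms to the fitted density, then choose a cutoff with the Youden index) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one-dimensional synthetic data instead of the multivariate Baishan survey, a plain expectation-maximization fit written in pure Python, and the negative log-density as the anomaly score. All function and variable names (`fit_gmm_1d`, `anomaly_scores`, `youden_threshold`, and so on) are hypothetical, and the synthetic labels merely stand in for known mineral deposits. The Youden index is J = sensitivity + specificity - 1, which equals TPR - FPR.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Gaussian mixture density: a weighted sum of Gaussian densities."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))


def fit_gmm_1d(data, k, n_iter=100):
    """Fit a k-component 1-D mixture with plain EM (illustrative, not optimized)."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic start: means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    variances = [sum((x - centre) ** 2 for x in data) / n] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pj / total for pj in p] if total > 0 else [1.0 / k] * k)
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor avoids a collapsing component
    return weights, means, variances


def anomaly_scores(data, weights, means, variances):
    """Negative log-density: larger means the sample conforms less to the model."""
    return [-math.log(max(mixture_pdf(x, weights, means, variances), 1e-300))
            for x in data]


def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = TPR - FPR for the rule score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = tp / n_pos - fp / n_neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Synthetic stand-in data (NOT the Baishan survey): a bimodal background plus
# a few outlying samples labelled 1, playing the role of known deposits.
rng = random.Random(42)
background = ([rng.gauss(0.0, 1.0) for _ in range(300)]
              + [rng.gauss(6.0, 1.5) for _ in range(200)])
outliers = [rng.gauss(15.0, 1.0) for _ in range(20)]
samples = background + outliers
labels = [0] * len(background) + [1] * len(outliers)

# The mixture is fitted to all samples without using the labels;
# the labels are used only to pick the cutoff.
weights, means, variances = fit_gmm_1d(samples, k=2)
scores = anomaly_scores(samples, weights, means, variances)
threshold, youden_j = youden_threshold(scores, labels)
flagged = [s >= threshold for s in scores]
print(f"threshold={threshold:.3f}  J={youden_j:.3f}  "
      f"flagged={sum(flagged)} of {len(samples)}")
```

For multivariate survey data, a library implementation such as scikit-learn's `GaussianMixture` (whose `score_samples` method returns per-sample log-likelihoods) would be a more practical choice than this hand-written EM loop; the scoring and thresholding steps would stay the same in outline.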