Data Preprocessing Method for Liquid Chromatography–Mass Spectrometry Based Metabolomics

Wei, Xiaoli, Shi, Xue, Kim, Seongho, Zhang, Li, Patrick, Jeffrey S., Binkley, Joe, McClain, Craig, Zhang, Xiang
Analytical chemistry 2012 v.84 no.18 pp. 7963-7971
algorithms, data collection, ions, liquid chromatography, liver, mass spectrometry, metabolites, metabolomics, mice, regression analysis
A set of data preprocessing algorithms for peak detection and peak list alignment is reported for the analysis of liquid chromatography–mass spectrometry (LC–MS)-based metabolomics data. For spectrum deconvolution, peak picking is achieved at the selected ion chromatogram (XIC) level. To estimate and remove the noise in XICs, each XIC is first segmented into several peak groups based on the continuity of scan number, and the noise level is estimated from all XIC signals, excluding the regions that potentially contain metabolite ion peaks. After noise removal, the peaks of molecular ions are detected using both the first and the second derivatives, followed by an efficient exponentially modified Gaussian-based peak deconvolution method for peak fitting. A two-stage alignment algorithm is also developed, in which the retention times of all peaks are first transformed into the z-score domain and the peaks are then aligned based on their mixture scores after retention time correction using a partial linear regression. Analysis of a set of spike-in LC–MS data from three groups of samples containing 16 metabolite standards mixed with metabolite extract from mouse livers demonstrates that the developed data preprocessing method performs better than two existing popular data analysis packages, MZmine 2.6 and XCMS², for peak picking, peak list alignment, and quantification.
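To make three of the ingredients named in the abstract more concrete, the following is a minimal Python sketch of an exponentially modified Gaussian (EMG) peak shape, apex detection from first and second differences, and a z-score transform of retention times. It is an illustration under stated assumptions, not the authors' implementation: the function names (`emg`, `detect_apexes`, `zscore`), the EMG parameterization (`area`, `mu`, `sigma`, `tau`), the `min_height` threshold, and the synthetic XIC are all hypothetical. The noise estimation, peak fitting, mixture scoring, and partial linear regression steps are not shown.

```python
import math


def emg(t, area, mu, sigma, tau):
    """Exponentially modified Gaussian: a Gaussian (mu, sigma) convolved
    with an exponential decay of time constant tau, scaled to `area`.
    (Assumed parameterization; the abstract does not specify one.)"""
    exponent = sigma ** 2 / (2 * tau ** 2) - (t - mu) / tau
    z = (sigma / tau - (t - mu) / sigma) / math.sqrt(2)
    return area / (2 * tau) * math.exp(exponent) * math.erfc(z)


def detect_apexes(intensity, min_height=0.0):
    """Return indices of candidate peak apexes in a sampled chromatogram.

    An apex is a point where the first difference turns from positive to
    non-positive and the second difference is negative. The curvature test
    is implied by the sign change on a discrete grid; it is kept explicit
    only to mirror the two-derivative idea. `min_height` is a hypothetical
    stand-in for a noise threshold."""
    apexes = []
    for i in range(1, len(intensity) - 1):
        rising = intensity[i] - intensity[i - 1]
        falling = intensity[i + 1] - intensity[i]
        curvature = intensity[i + 1] - 2 * intensity[i] + intensity[i - 1]
        if rising > 0 and falling <= 0 and curvature < 0 and intensity[i] > min_height:
            apexes.append(i)
    return apexes


def zscore(values):
    """Map values (e.g. retention times) to z-scores using the
    population standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


# Synthetic, noise-free XIC containing a single tailed peak (made-up values).
dt = 0.05
times = [i * dt for i in range(201)]
xic = [emg(t, area=1.0, mu=5.0, sigma=0.2, tau=0.3) for t in times]
apexes = detect_apexes(xic, min_height=0.01)
apex_times = [times[i] for i in apexes]

# z-scores are unchanged by a linear shift or rescaling of retention time.
z_run_a = zscore([1.0, 2.0, 3.0])
z_run_b = zscore([10.0, 20.0, 30.0])
```

Because of the exponential tail, the EMG apex falls slightly after `mu`, which is one reason a tailed model can suit chromatographic peaks better than a plain Gaussian. The z-score transform puts retention times from different runs on a common scale before any matching is attempted, which is the role the abstract gives it in the first alignment stage.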