A simple pipeline for the assessment of legacy soil datasets: An example and test with soil organic carbon from a highly variable area
- Schillaci, Calogero; Acutis, Marco; Vesely, Fosco; Saia, Sergio
- Catena 2019 v.175 pp. 110-122
- arable soils, bulk density, data collection, databases, farmers, funding, georeferencing, land use, linear models, orchards, prediction, soil organic carbon, soil properties, vineyards, woodlands, Italy, Sicily
- Legacy databases provide unique information on soil properties and act as a guide for the setup of monitoring processes. However, their use requires an evaluation of their drawbacks, especially when the aim is to model soil traits by depth. We set up a procedure for the integration and error correction of a soil legacy database. This database consisted of 6994 records in its original form and 6674 records after correction. These records were collected from 2886 locations in Sicily, a 25,711-km² island in southern Italy. Samples were taken in arable lands (5471 records), orchards, vineyards and seminatural lands (3010 records), and woodland and natural areas (1203 records). The procedure for integration and error highlighting improved the prediction of soil organic carbon (SOC); it was tested with a general linear model with covariate selection by the Least Absolute Shrinkage and Selection Operator (LASSO). We focussed on exploring the amount of legacy information available as georeferenced soil properties. SOC and fine earth fractions were analysed for each sample, whereas bulk density was provided for only 20% of the samples. These results will help to account for the legacy data available and propose an analysis to harmonize an SOC dataset; highlight missing or incorrect data; summarize data; and offer synthesis criteria for benchmarking SOC in different land uses and pedological areas. In addition, the results may stimulate funding bodies to support research within an open data framework, which can translate into more sustainable use of resources, improved communication between governments and farmers, and the production of standard datasets that facilitate and meet the requirements of regional agro-environmental modelling.
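The abstract names LASSO as the covariate selection step but gives no implementation details. As a rough illustration of the general technique only, and not the authors' code, the sketch below fits a LASSO-penalised linear model by coordinate descent on synthetic data. The covariate names (`elevation`, `rainfall`, `unrelated`), the penalty value `alpha=0.2`, and all helper function names are assumptions made for this example; none of them come from the article.

```python
import random


def standardise(X):
    """Centre each column and scale it to unit variance (constant columns are only centred)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    sds = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5 or 1.0
           for j in range(p)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; this is what zeroes out weak covariates."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * sum((y - b0 - X.b)^2) + alpha * sum(|b|).

    Assumes the columns of X are already centred (see standardise), so the
    intercept is simply the mean of y. Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    coefs = [0.0] * p
    resid = [yi - intercept for yi in y]  # residuals with all coefficients at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Covariance of column j with the residual that excludes column j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * coefs[j]) for i in range(n)) / n
            new_coef = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_coef - coefs[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                coefs[j] = new_coef
    return intercept, coefs


def make_synthetic(n=200, seed=0):
    """Hypothetical data: the response depends on two covariates; the third is pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        elevation, rainfall, unrelated = (rng.gauss(0, 1) for _ in range(3))
        X.append([elevation, rainfall, unrelated])
        y.append(1.5 + 2.0 * elevation - 1.0 * rainfall + rng.gauss(0, 0.1))
    return X, y


X, y = make_synthetic()
intercept, coefs = lasso_coordinate_descent(standardise(X), y, alpha=0.2)
names = ["elevation", "rainfall", "unrelated"]
selected = [name for name, b in zip(names, coefs) if b != 0.0]
print(selected)
```

Covariates whose coefficient is shrunk exactly to zero are dropped from the model, which is how LASSO performs selection. In practice the penalty strength would usually be chosen by cross-validation rather than fixed by hand as it is here.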