A Big Data approach to forestry harvesting productivity

Rossit, Daniel Alejandro, Olivera, Alejandro, Viana Céspedes, Víctor, Broz, Diego
Computers and electronics in agriculture 2019 v.161 pp. 29-52
Eucalyptus, algorithms, analytical methods, computers, data collection, decision support systems, forest industries, forestry, forests, harvesters, harvesting, models, plantations, regression analysis, Uruguay
Modern industrial technology makes it possible to collect and process large amounts of data, providing valuable information for different industrial activities. A representative case of this evolution is the forest industry, since modern forest harvesters are equipped with automatic data collection devices. The collected data can be extracted and transferred to computers using special forestry protocols, such as StanForD, where they can be analysed. This capability of modern harvesters makes it possible to study harvest productivity with thousands of records, instead of the few hundred that could be obtained through traditional methods (visual inspection or filming). However, traditional analytical methods, such as linear regression, are not able to deal with this volume of data (or, at least, do not take full advantage of the data's potential); consequently, new approaches must be considered. Our proposal is to address this shortcoming using data mining methods; specifically, we consider decision trees and k-means algorithms. We study how different variables (DBH, species, shift and operator) affect the productivity of a forest harvester, using data from real scenarios. The harvest data come from Eucalyptus spp. plantations in Uruguay, where the harvest system implemented is cut-to-length. To analyse the data, productivity is first modelled categorically using two different approaches: ranges of equal intervals and ranges calculated using the k-means clustering algorithm. Then, decision tree methods are applied to analyse the influence of the aforementioned variables on productivity. The results show that clustering is a suitable approach for modelling scalar productivity categorically and that DBH is the most influential factor in productivity. Moreover, once DBH values were set, decision trees allowed new variables to be used to describe productivity, achieving very high levels of accuracy, in many cases greater than 90%.
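
The two ways of turning scalar productivity into categories that the abstract mentions can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of three classes, the deterministic centroid initialisation and the productivity values are all assumptions made up for illustration, and the k-means routine is a plain one-dimensional version of Lloyd's algorithm.

```python
def equal_interval_labels(values, k):
    """Assign each value to one of k equal-width ranges between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0 for _ in values]
    # The maximum sits on the upper edge, so clamp it into the last range.
    return [min(int((v - lo) / width), k - 1) for v in values]


def nearest(value, centroids):
    """Index of the centroid closest to value (first index wins on ties)."""
    return min(range(len(centroids)), key=lambda c: abs(value - centroids[c]))


def kmeans_1d_labels(values, k, iterations=100):
    """One-dimensional Lloyd's algorithm; returns (labels, centroids)."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed initialisation: centroids spread evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in values]
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    # Final assignment against the final centroids.
    return [nearest(v, centroids) for v in values], centroids


# Hypothetical productivity records (illustrative numbers, not data from the study).
productivity = [1, 2, 3, 10, 11, 12, 50]

print("equal intervals:", equal_interval_labels(productivity, 3))
labels, centroids = kmeans_1d_labels(productivity, 3)
print("k-means labels: ", labels)
print("k-means centres:", centroids)
```

With these made-up values, the single large record stretches the equal-width ranges so that all the other records fall into the first range, whereas k-means places its class boundaries where the data actually group. The resulting class labels are the kind of categorical target that a decision tree could then be trained to predict from DBH, species, shift and operator.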