Machine learning approaches and their current application in plant molecular biology: A systematic review

Silva, Jose Cleydson F., Teixeira, Ruan M., Silva, Fabyano F., Brommonschenkel, Sergio H., Fontes, Elizabeth P.B.
Plant science 2019 v.284 pp. 37-47
algorithms, artificial intelligence, data collection, databases, genes, genomics, immunity, models, molecular biology, pathogens, plant genetics, systematic review
Machine learning (ML) is a field of artificial intelligence that has rapidly emerged in molecular biology, allowing Big Data concepts to be exploited in plant genomics. In this context, the main challenges are how to analyze massive datasets and extract new knowledge at all levels of cellular systems research. ML techniques allow complex interactions to be inferred in several biological systems. Despite its potential, ML has been underused because of its complex computational algorithms and terminology. A systematic review that disentangles ML approaches is therefore relevant for plant scientists and is the aim of this study. We present the main steps of ML development (from data selection to evaluation of classification/prediction models) and discuss them with respect to functional genomics, mainly pathogen effector genes in plant immunity. We also consider how to access public databases within an ML framework to advance plant molecular biology, and we introduce novel, powerful tools such as deep learning.
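
As an illustration of the workflow the abstract outlines (data selection, model training, and evaluation of a classification model), the following is a minimal sketch in plain Python. It is not taken from the review. The data are synthetic two-feature samples standing in for, say, effector versus non-effector genes, and the nearest-centroid classifier and all function names (`make_toy_data`, `train_test_split`, `fit_centroids`, `predict`, `accuracy`) are assumptions chosen for brevity.

```python
import random


def make_toy_data(n_per_class=50, seed=0):
    """Generate hypothetical two-feature samples for two classes (not real data)."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            features = [rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0)]
            data.append((features, label))
    return data


def train_test_split(data, test_fraction=0.3, seed=0):
    """Shuffle a copy of the data and hold out a fraction for evaluation."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Training step: compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, test):
    """Evaluation step: fraction of held-out samples classified correctly."""
    correct = sum(1 for features, label in test if predict(centroids, features) == label)
    return correct / len(test)


data = make_toy_data()                 # data selection (synthetic here)
train, test = train_test_split(data)   # hold out samples for evaluation
model = fit_centroids(train)           # model training
print(f"held-out accuracy: {accuracy(model, test):.2f}")
```

In practice, the features would come from curated sequence or expression data, and the classifier would typically be replaced by a method from an established ML library. The split, fit, and evaluate structure stays the same.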