Machine learning links seed composition, glucosinolates and viability of oilseed rape after 31 years of long-term storage

Nagel, Manuela, Holstein, Katharina, Willner, Evelin, Börner, Andreas
Seed science research 2018 v.28 no.4 pp. 340-348
Brassica napus, alpha-linolenic acid, artificial intelligence, erucic acid, fatty acid composition, genotype, glucosinolates, half life, least squares, lipid content, neural networks, oleic acid, seed longevity, seed oils, seedlings, seeds, stearic acid, storage temperature, storage time, viability
Seed longevity is influenced by many factors, among which seed lipid content and fatty acid composition are widely discussed. Here, linear and non-linear regressions based on machine learning were applied to analyse germinability and seed composition in a set of 42 oilseed rape (Brassica napus L.) accessions that were grown in the same environment at the same time and then stored for up to 31 years at 7°C. Mean viability was halved after 27.0 years of storage, but this figure concealed a major influence of genotype. There was also wide variation in fatty acid composition, particularly in oleic, α-linolenic, eicosenoic and erucic acid. Linear regression (rL) revealed significant correlation coefficients between normal seedling appearance and the content of α-linolenic acid (+0.52) and total oil (+0.59). Multivariate regression using artificial neural networks, including a radial basis function (RBF), a multilayer perceptron (MLP) and partial least squares (PLS), recognized underlying structures and revealed highly significant correlation coefficients (rM) for oil content (+0.87), eicosenoic acid (+0.75), stearic acid (+0.73) and lignoceric acid (+0.97). Oil content alone, or a combination of oleic, α-linolenic, arachidic, eicosenoic and eicosadienoic acids and glucosinolates, gave the highest model fit, with R² of 0.90 and 0.88, respectively. In addition, the glucosinolate content, which is predominant in the Brassicaceae family and ranged from 4.6 to 79.5 µM, was negatively correlated with viability (rL = ‒0.43). In summary, oil content, some fatty acids and glucosinolates contribute to variation in the average half-life (15.2 to 50.7 years) of oilseed rape seeds. In contrast to linear regression, multivariate regression using artificial neural networks revealed strong associations for combinations of parameters, including underestimated minor fatty acids such as arachidic, stearic and eicosadienoic acids.
This indicates that genetic and seed composition factors contribute to seed longevity. In addition, multivariate regressions might be a successful approach to predict seed viability based on fatty acids and seed oil content.
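
The linear correlation coefficients (rL) quoted in the abstract are of the kind produced by a Pearson correlation between a seed composition trait and a viability measure. The following is a minimal sketch of that calculation only, not the authors' analysis pipeline. The function name `pearson_r` and all data values are hypothetical and chosen purely for illustration; they are not taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical illustration data (NOT from the paper):
# oil content (%) and share of normal seedlings (%) for five imaginary accessions.
oil_content = [38.0, 40.5, 42.0, 44.5, 46.0]
normal_seedlings = [35.0, 52.0, 48.0, 70.0, 66.0]

r = pearson_r(oil_content, normal_seedlings)
print(f"rL = {r:+.2f}")
```

A coefficient of this kind captures only a pairwise linear association, which is why the abstract contrasts it with multivariate models that can combine several fatty acids at once.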