Effect of Missing Values on Variance Component Estimates in Multienvironment Trials

Aguate, Fernando, Crossa, Jose, Balzarini, Mónica
Crop science 2019 v.59 no.2 pp. 508-517
Triticum aestivum, cultivars, data collection, models, variance, wheat
Multienvironment trials (METs) are conducted to evaluate cultivars across locations and years, often with an incomplete data structure due to annual cultivar replacements. The imbalance could cause biased variance component (VC) estimates depending on data dimensions, proportion of missing values, and the cultivar dropout mechanism. The objective of this study was to quantify the bias of VC estimates obtained from imbalanced datasets. We performed simulations of METs with different data dimensions (number of cultivars, locations, and years) using VC parameters taken from real wheat (Triticum aestivum L.) METs. The missing values were generated by annually dropping and replacing cultivars. Genotypic variance estimates obtained from analyses of 2 yr of METs with >40% missing values were overestimated in all simulated scenarios. The percentage of bias was highly influenced by the number of years considered for analysis. Variance component estimates from simulations with more years of METs were less biased: 8-yr analyses produced <5% bias in the genotypic variance and its interactions, even in highly imbalanced datasets. Increasing the number of annually tested cultivars or the number of locations was less beneficial in terms of decreasing bias than increasing the number of years. Cultivar-mean repeatability was considerably affected by increases in the percentage of missing values, which caused reductions of up to 60% with few years of METs. Results showed that, even with cultivar replacement, linear mixed models can estimate VCs with <5% bias when there are four or more years of METs, with or without imbalance (up to 40%).
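The way annual cultivar replacement produces missing values can be illustrated with a small sketch. It is not the authors' simulation code: the function names `dropout_mask` and `missing_fraction`, the parameter names, and the choice to drop cultivars completely at random are all assumptions made for illustration (the abstract notes that the dropout mechanism itself affects bias, and a real program may drop cultivars based on performance). The sketch only builds the cultivar-by-year presence pattern and reports the share of empty cells; it does not fit a mixed model.

```python
import random


def dropout_mask(n_years, n_per_year, n_replaced, seed=0):
    """Build a cultivar-by-year presence table (list of lists of bool).

    Hypothetical helper: each year after the first, `n_replaced` of the
    currently tested cultivars are dropped at random (an assumption) and
    the same number of new cultivars enter. Dropped cultivars never return.
    """
    if n_years < 1 or not 0 <= n_replaced <= n_per_year:
        raise ValueError("need n_years >= 1 and 0 <= n_replaced <= n_per_year")
    rng = random.Random(seed)
    total = n_per_year + n_replaced * (n_years - 1)  # cultivars ever tested
    mask = [[False] * n_years for _ in range(total)]
    active = list(range(n_per_year))
    next_id = n_per_year
    for year in range(n_years):
        if year > 0:
            dropped = set(rng.sample(active, n_replaced))
            active = [c for c in active if c not in dropped]
            active.extend(range(next_id, next_id + n_replaced))
            next_id += n_replaced
        for cultivar in active:
            mask[cultivar][year] = True
    return mask


def missing_fraction(mask):
    """Share of cultivar-by-year cells in which the cultivar was not tested."""
    cells = sum(len(row) for row in mask)
    observed = sum(sum(row) for row in mask)
    return 1.0 - observed / cells


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the study.
    mask = dropout_mask(n_years=2, n_per_year=30, n_replaced=20)
    print(f"{missing_fraction(mask):.0%} of cultivar-by-year cells are missing")
```

Under these assumptions the missing share depends only on the dimensions: 30 cultivars tested per year with 20 replaced after the first of 2 years gives 50 cultivars in total and 1 - 30/50 = 40% missing cells. Extending the same replacement rate over more years raises the missing share further, which is the kind of imbalance the simulated scenarios vary.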