Some Consequences of Using the Horsfall-Barratt Scale for Hypothesis Testing

Bock, C.H., Gottwald, T.R., Parker, P.E., Ferrandino, F., Welham, S., Bosch, F. van den, Parnell, S.
Phytopathology 2010 v.100 no.10 pp. 1030
plant diseases and disorders, disease severity, data analysis, statistical analysis, methodology, accuracy, simulation models, leaves, model validation, equations, mathematical models, sampling, variance
Comparing treatment effects by hypothesis testing is a common practice in plant pathology. Nearest percent estimates (NPEs) of disease severity were compared with Horsfall-Barratt (H-B) scale data to explore whether assessment method affects hypothesis testing. A simulation model based on field-collected data using leaves with disease severity of 0 to 60% was used; the relationship between NPEs and actual severity was linear, a hyperbolic function described the relationship between the standard deviation of the rater mean NPE and actual disease, and a lognormal distribution was assumed to describe the frequency of NPEs of specific actual disease severities by raters. Results of the simulation showed that standard deviations of mean NPEs were consistently similar to the original rater standard deviation from the field-collected data; however, the standard deviations of the H-B scale data deviated from the original rater standard deviation, particularly at 20 to 50% severity, over which H-B scale grade intervals are widest; thus, it is over this range that differences in hypothesis testing are most likely to occur. To explore this, two normally distributed, hypothetical severity populations were compared using a t test with NPEs and H-B midpoint data. NPE data had a higher probability of rejecting the null hypothesis (H0) when H0 was false, but greater sample size increased the probability of rejecting H0 for both methods, with the H-B scale data requiring up to a 50% greater sample size to attain the same probability of rejecting H0 as NPEs when H0 was false. The increase in sample size offsets the increased sample variance caused by inaccurate individual estimates due to H-B scale midpoint scaling. As expected, various population characteristics influenced the probability of rejecting H0, including the difference between the two severity distribution means, their variability, and the ability of the raters.
Inaccurate raters showed a similar probability of rejecting H0 when H0 was false using either assessment method, but average and accurate raters had a greater probability of rejecting H0 when H0 was false using NPEs compared with H-B scale data. Accurate raters had, on average, better resolving power for estimating disease than that offered by the H-B scale and, therefore, the resulting sample variability was more representative of the population when sample size was limiting. Thus, there are various circumstances under which H-B scale data have a greater risk of failing to reject H0 when H0 is false (a type II error) compared with NPEs.
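
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation model, and it rests on several assumptions: the grade boundaries in HB_BOUNDS are the commonly cited H-B interval limits (the abstract does not list them), midpoints are taken as simple arithmetic midpoints, rater error is omitted so that the untransformed severities stand in for ideal NPEs, and a normal approximation to the two-sample test replaces the t test so that only the Python standard library is needed. The names hb_midpoint, rejects_h0, and power are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance

# Assumed H-B grade boundaries in percent severity (not given in the abstract).
HB_BOUNDS = [0, 3, 6, 12, 25, 50, 75, 88, 94, 97, 100]


def hb_midpoint(severity):
    """Map a percent severity to the arithmetic midpoint of its H-B interval."""
    if severity <= 0:
        return 0.0
    if severity >= 100:
        return 100.0
    for lo, hi in zip(HB_BOUNDS, HB_BOUNDS[1:]):
        if severity < hi:  # lo <= severity < hi
            return (lo + hi) / 2
    return 100.0


def rejects_h0(a, b, alpha=0.05):
    """Two-sided two-sample test; normal approximation used in place of a t test."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        # Degenerate case: every value identical within each sample.
        return mean(a) != mean(b)
    z = abs(mean(a) - mean(b)) / se
    p_value = 2 * (1 - NormalDist().cdf(z))
    return p_value < alpha


def power(mu1, mu2, sd, n, transform, trials=2000, seed=1):
    """Estimate the probability of rejecting H0 for two normal severity populations."""
    rng = random.Random(seed)

    def sample(mu):
        # Clamp simulated severities to the 0-100% range before scoring.
        return [transform(min(100.0, max(0.0, rng.gauss(mu, sd)))) for _ in range(n)]

    hits = sum(rejects_h0(sample(mu1), sample(mu2)) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # Hypothetical populations inside the wide 25-50% H-B interval.
    for n in (10, 20, 40):
        npe = power(30, 35, 5, n, transform=lambda s: s)
        hb = power(30, 35, 5, n, transform=hb_midpoint)
        print(f"n={n}: ideal NPE power={npe:.2f}, H-B midpoint power={hb:.2f}")
```

Running the script prints the estimated rejection probabilities for both scoring methods at several sample sizes, which gives a rough sense of how coarse intervals in the 25 to 50% range can affect the test. The population means, standard deviation, and sample sizes are illustrative values only, not figures from the study.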