Using R² to compare least-squares fit models: When it must fail
- Tellinghuisen, Joel, Bolster, Carl H.
- ARS USDA Submissions 2011 v.105 no.2 pp. 220
- least squares, models, variance
- R² can be used correctly to select from among competing least-squares fit models when the data are fitted in common form and with common weighting. However, when models are compared by fitting data that have been mathematically transformed in different ways, R² is a flawed statistic, even when the data are properly weighted in accord with the transformations. The reason is that in its most commonly used form, R² can be expressed in terms of the excess variance (s²) and the total variance in y (s_y²). The first of these is either invariant or approximately so with proper weighting, but the second can vary substantially under data transformations. When given data are analyzed "as is" with different models and fixed weights, s_y² remains constant and R² is a valid statistic. Even then, s² (or χ² in weighted fitting) is arguably a better metric for such comparisons.
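The dependence of R² on s² and s_y² described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not in the record: the fits are unweighted straight-line fits, the data values are invented for illustration, and the helper name `fit_stats` is hypothetical. For an unweighted fit with n points and p parameters, the usual definition gives R² = 1 − (n − p)s² / ((n − 1)s_y²), so any transformation that changes s_y² changes R² even when the quality of the fit is comparable.

```python
import numpy as np

def fit_stats(x, y, degree=1):
    """Unweighted polynomial least-squares fit; returns (R², s², s_y²)."""
    n = len(y)
    p = degree + 1                      # number of adjustable parameters
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    ss_res = float(np.sum(resid**2))    # residual sum of squares
    ss_tot = float(np.sum((y - y.mean())**2))
    s2 = ss_res / (n - p)               # variance estimated from the residuals
    sy2 = ss_tot / (n - 1)              # total variance in y
    r2 = 1.0 - ss_res / ss_tot
    return r2, s2, sy2

# Hypothetical data, invented for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([8.2, 5.9, 4.6, 3.2, 2.6, 1.9])

n, p = len(x), 2
for label, y_t in [("y", y), ("ln y", np.log(y)), ("1/y", 1.0 / y)]:
    r2, s2, sy2 = fit_stats(x, y_t)
    # Check the identity R² = 1 - (n - p) s² / ((n - 1) s_y²).
    assert np.isclose(r2, 1.0 - (n - p) * s2 / ((n - 1) * sy2))
    print(f"{label:5s} R2={r2:.4f}  s2={s2:.4g}  s_y2={sy2:.4g}")
```

Running the loop prints s_y² for each form of the same data. The values differ by large factors between y, ln y, and 1/y, which is why R² values from differently transformed fits are not on a common scale.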