Categorical CVA biplots

D.T. Rodwell, C.J. van der Merwe, S. Gardner-Lubbe
Computational statistics & data analysis 2021 v.163 pp. 107299
data analysis, data collection, multivariate analysis, mushrooms, principal component analysis
Techniques for visualising and understanding large amounts of data are of paramount importance. In most settings the data are multivariate, which further stresses the need for effective visualisation techniques. Multivariate visualisation techniques such as canonical variate analysis (CVA) biplots allow for simultaneous lower-dimensional visualisation and classification by incorporating class-specific information. CVA biplots, however, are currently restricted to numerical data. By combining concepts from CVA and non-linear principal component analysis (PCA) biplots, a new biplot construction methodology is proposed that extends the traditional CVA biplot to categorical variables. This technique, named CVA(Hᵣ), is showcased using the established mushroom data set, which contains a mix of categorical and ordinal variables. The method improves upon existing biplot constructions in terms of classification accuracy and class separation.
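
For orientation, the classical numerical CVA that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the textbook technique only, not the CVA(Hᵣ) method of the paper: it handles numerical variables, assumes the within-class scatter matrix is positive definite, and the function name cva_scores and its parameters are hypothetical names chosen for this example.

```python
import numpy as np


def cva_scores(X, y, n_dims=2):
    """Project numerical data onto its leading canonical variates.

    Hypothetical helper for illustration: classical CVA only,
    with no handling of categorical variables.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    grand_mean = X.mean(axis=0)
    p = X.shape[1]

    W = np.zeros((p, p))  # pooled within-class scatter
    B = np.zeros((p, p))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centred = Xc - mean_c
        W += centred.T @ centred
        d = (mean_c - grand_mean)[:, None]
        B += len(Xc) * (d @ d.T)

    # Solve B m = lambda W m through the symmetric matrix
    # W^{-1/2} B W^{-1/2}; assumes W is positive definite.
    evals_w, evecs_w = np.linalg.eigh(W)
    W_inv_sqrt = evecs_w @ np.diag(evals_w ** -0.5) @ evecs_w.T
    evals, evecs = np.linalg.eigh(W_inv_sqrt @ B @ W_inv_sqrt)

    # Keep the directions with the largest between-to-within ratio.
    order = np.argsort(evals)[::-1][:n_dims]
    M = W_inv_sqrt @ evecs[:, order]

    return (X - grand_mean) @ M, M
```

Calling cva_scores(X, y) with an n-by-p numerical array X and a length-n vector of class labels y returns the two-dimensional sample coordinates together with the p-by-2 transformation matrix. In the returned coordinates the pooled within-class scatter is the identity, which is what makes distances to class means usable for classification in the plot.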