Correspondence analysis (CA) has a special place in data science: it analyses data measured on categorical scales and produces visualizations that facilitate the interpretation and understanding of multivariate categorical data. CA is primarily a method of unsupervised learning; that is, it is designed to identify structures that are latent in the data, for example dimensions that identify the greatest differences in the observations as well as their similarities and groupings. Inherent in CA is a measure of distance that quantifies the proximities between observations, based on categorical data, and that can also lead to formal identification of clusters. The method has found extensive applications in sociology, linguistics, archaeology and genetics. This one-day workshop will focus on its applicability and usefulness in sensometrics.
In the first part of the course, before lunch, I will explain the basic ideas of correspondence analysis, including measures of distance and the interpretation of the biplot, both at the heart of the data visualization. Some simple applications in sensometrics will be presented as well as the implementation of the method using R software.
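As a rough illustration of the kind of distance involved, the sketch below computes the chi-square distance between two row profiles of a contingency table, which is the distance on which CA is conventionally based. It is written in Python rather than the R used in the workshop, and the table of counts (three products by three sensory descriptors) and the function name `chi_square_distance` are invented for illustration only; they are not taken from the course material.

```python
from math import sqrt

def chi_square_distance(table, i, k):
    """Chi-square distance between the profiles of rows i and k.

    Assumes `table` is a rectangular list of non-negative counts in which
    every row and every column has a positive total.
    """
    grand_total = sum(sum(row) for row in table)
    n_cols = len(table[0])

    # Column masses: each column total as a share of the grand total.
    col_masses = [sum(row[j] for row in table) / grand_total
                  for j in range(n_cols)]

    # Row profiles: each row divided by its own total.
    total_i = sum(table[i])
    total_k = sum(table[k])
    profile_i = [x / total_i for x in table[i]]
    profile_k = [x / total_k for x in table[k]]

    # Squared profile differences, each weighted by the inverse column mass.
    return sqrt(sum((a - b) ** 2 / c
                    for a, b, c in zip(profile_i, profile_k, col_masses)))

# Hypothetical counts: rows are products, columns are sensory descriptors.
counts = [
    [20, 10, 10],
    [10, 20, 10],
    [20, 10, 10],
]

d01 = chi_square_distance(counts, 0, 1)
d02 = chi_square_distance(counts, 0, 2)  # identical profiles, so zero
print(d01, d02)
```

Because the distance is computed on profiles rather than raw counts, two products with the same relative pattern of descriptor use are at distance zero regardless of how many times each was evaluated.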
After lunch, I will explain how CA extends to more complex data, the most important extension being multiple correspondence analysis (MCA), which treats several categorical variables simultaneously. The case of multi-way data in the CA/MCA context, for example multi-block or multi-occasion data, will also be considered. Some more challenging sensometric data sets will be analysed to demonstrate the full versatility of the approach. Finally, the role of CA/MCA in supervised learning will be discussed, where a specific response variable is modelled in terms of categorical predictors.
Michael Greenacre is “Senior Talent Professor” at the Universitat Pompeu Fabra in Barcelona and affiliated professor of the Barcelona School of Management. His academic work centres on methods for analysing multivariate data: he has specialized in correspondence analysis since his doctoral studies with Jean-Paul Benzécri, and later in compositional data analysis after collaborations with both John Aitchison and Paul Lewi. He has over 100 scientific publications in international refereed journals and has written or co-edited 12 books, including Theory and Applications of Correspondence Analysis (1984), three separate editions of Correspondence Analysis in Practice (1993, 2007 and 2016) and, most recently, Compositional Data Analysis in Practice (2018). He has given short courses in 15 countries around the world, mostly to marine biologists but also to market researchers and statisticians. He has also been the co-organizer, with Prof. Jörg Blasius in Bonn, Germany, of the successful series of quadrennial conferences called CARME (Correspondence Analysis and Related Methods), which have taken place since 1991 in Cologne (three times), Barcelona, Rotterdam, Rennes, Naples, Stellenbosch (South Africa) and, most recently, Bonn in 2023. Michael has a wide range of interests in the world of data analysis and has worked with sociologists, Arctic ecologists, biologists, biochemists, geochemists and geneticists. He is also a musician who has published two CD albums of his own music, and he is well known for his satirical statistical songs on the YouTube channel