Abstract

Visualizing data is crucial for exploratory data analysis, checking assumptions, and validating model performance. However, visualization quickly becomes complex as the dimensionality of the data or features increases. Traditionally, linear projections have been used to view discrete pairs of components to mitigate this complexity. Data visualization tours are a class of dynamic linear projections that animate many linear projections over small changes to the projection basis. The permanence of observations between nearby frames potentially conveys more information than discrete orthogonal frames alone.

Tours are categorized by the path of their bases. Manual tours uniquely allow for user-controlled steering of a path of bases, where the contributions of individual variables can be changed. The radial tour is a specific case of the manual tour, which freezes the angle of movement but allows the magnitude to be varied. The radial tour is central to the work covered in this thesis. Chapter 3 clarifies the theoretical basis of the radial tour. The details of implementation and illustration of its use are provided. It introduces an open-source R package that facilitates creating these tours with various display choices and user controls.

The user steering of the radial tour should allow the analyst to better understanding of the variable importance to any structure revealed in a projection. Chapter 4 discusses a within-participant user study comparing the radial tour with current practices: principal component analysis (PCA) and the original tour type without steering. The \(n=108\) crowdsourced participants performed a variable attribution task describing the separation between clusters. The results find that the radial tour leads to a more accurate variable attribution. Participants also reported that the radial tour was their preferred visualization method.

Nonlinear modeling techniques are sometimes referred to as black-box models due to the difficulty of interpreting the model terms. Recent research in Explainable Artificial Intelligence (XAI) tries to bring interpretability to these models through local explanations. Local explanations are a class of techniques that approximate the linear-variable importance for prediction at one point in the data. Chapter 5 provides a new approach for exploring the variable sensitivity of local explanations using the radial tour. Local explanations can be considered projection bases. The radial tour is used to vary the contribution of a variable to assess its importance for any particular prediction. This novel analysis is illustrated using four contemporary data examples covering classification and regression. An accompanying R package provides a graphical user interface (GUI) for conducting this analysis.