Bibliography

A. Adadi and M. Berrada. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE access, 6: 52138–52160, 2018.
J. A. Anderson. An introduction to neural networks. MIT press, 1995.
F. J. Anscombe. Graphs in Statistical Analysis. The American Statistician, 27(1): 17–21, 1973. URL http://www.jstor.org/stable/2682899.
L. Arms, D. Cook and C. Cruz-Neira. The benefits of statistical visualization in an immersive environment. In Virtual Reality, 1999. Proceedings., IEEE, pages. 88–95 1999. IEEE.
A. B. Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. García, S. Gil-López, D. Molina and R. Benjamins. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58: 82–115, 2020.
D. Asimov. The Grand Tour: A Tool for Viewing Multidimensional Data. SIAM journal on scientific and statistical computing, 6(1): 128–143, 1985. DOI https://doi.org/10.1137/0906011.
A. Batch, A. Cunningham, M. Cordeil, N. Elmqvist, T. Dwyer, B. H. Thomas and K. Marriott. There is no spoon: Evaluating performance, space use, and presence with expert domain users in immersive analytics. IEEE transactions on visualization and computer graphics, 26(1): 536–546, 2019.
R. A. Becker and W. S. Cleveland. Brushing Scatterplots. Technometrics, 29(2): 127–142, 1987. URL https://www.tandfonline.com/doi/abs/10.1080/00401706.1987.10488204.
E. Bertini, A. Tatu and D. Keim. Quality metrics in high-dimensional data visualization: An overview and systematization. IEEE Transactions on Visualization and Computer Graphics, 17(12): 2203–2212, 2011.
P. Besse, C. Castets-Renard, A. Garivier and J.-M. Loubes. Can Everyday AI be Ethical? Machine Learning Algorithm Fairness. Statistiques et Société, 6(3), 2019.
P. Biecek. DALEX: Explainers for complex predictive models in R. The Journal of Machine Learning Research, 19(1): 3245–3249, 2018.
P. Biecek and T. Burzykowski. Explanatory Model Analysis: Explore, Explain, and Examine Predictive Models. CRC Press, 2021.
B. E. Boser, I. M. Guyon and V. N. Vapnik. A training algorithm for optimal margin classifiers. In Proceedings of the fifth annual workshop on Computational learning theory, pages. 144–152 1992.
G. E. Box. Science and statistics. Journal of the American Statistical Association, 71(356): 791–799, 1976.
L. Breiman. Random forests. Machine learning, 45(1): 5–32, 2001a.
L. Breiman. Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical science, 16(3): 199–231, 2001b.
T. A. Brown. Confirmatory factor analysis for applied research. Guilford publications, 2015.
A. Buja, D. Cook, D. Asimov and C. Hurley. Computational Methods for High-Dimensional Rotations in Data Visualization. In Handbook of Statistics, pages. 391–413 2005. Elsevier. ISBN 978-0-444-51141-6. URL http://linkinghub.elsevier.com/retrieve/pii/S0169716104240147.
N. Cao, Y.-R. Lin, D. Gotz and F. Du. Z-Glyph: Visualizing outliers in multivariate data. Information Visualization, 17(1): 22–40, 2018. URL https://doi.org/10.1177/1473871616686635 [online; last accessed February 24, 2022].
S. K. Card, T. P. Moran and A. Newell. The psychology of human-computer interaction. CRC Press, 1983.
R. B. Cattell. The scree test for the number of factors. Multivariate behavioral research, 1(2): 245–276, 1966.
J. M. Chambers, W. S. Cleveland, B. Kleiner and P. A. Tukey. Graphical Methods for Data Analysis. 1983.
C. Chang, T. Dwyer and K. Marriott. An evaluation of perceptually complementary views for multivariate data. In 2018 IEEE Pacific Visualization Symposium (PacificVis), pages. 195–204 2018. IEEE.
W. Chang, J. Cheng, J. Allaire, C. Sievert, B. Schloerke, Y. Xie, J. Allen, J. McPherson, A. Dipert and B. Borges. Shiny: Web application framework for R. 2021. URL https://CRAN.R-project.org/package=shiny.
T. Chen, T. He, M. Benesty, V. Khotilovich, Y. Tang, H. Cho, K. Chen, R. Mitchell, I. Cano, T. Zhou, et al. Xgboost: Extreme Gradient Boosting. 2021. URL https://CRAN.R-project.org/package=xgboost.
H. Chernoff. The Use of Faces to Represent Points in K-Dimensional Space Graphically. Journal of the American Statistical Association, 68(342): 361–368, 1973. URL https://www.jstor.org/stable/2284077.
D. Cook and A. Buja. Manual Controls for High-Dimensional Data Projections. Journal of Computational and Graphical Statistics, 6(4): 464–480, 1997. URL http://www.jstor.org/stable/1390747.
D. Cook, A. Buja, J. Cabrera and C. Hurley. Grand Tour and Projection Pursuit. Journal of Computational and Graphical Statistics, 4(3): 155, 1995. URL https://www.jstor.org/stable/1390844?origin=crossref.
D. Cook, A. Buja, E.-K. Lee and H. Wickham. Grand Tours, Projection Pursuit Guided Tours, and Manual Controls. In Handbook of Data Visualization, pages. 295–314 2008. Berlin, Heidelberg: Springer Berlin Heidelberg. ISBN 978-3-540-33036-3 978-3-540-33037-0. URL http://link.springer.com/10.1007/978-3-540-33037-0_13.
D. Cook, U. Laa and G. Valencia. Dynamical projections for the visualization of PDFSense data. Eur. Phys. J. C, 78(9): 742, 2018. DOI 10.1140/epjc/s10052-018-6205-2.
J. Dastin. Amazon scraps secret AI recruiting tool that showed bias against women. Reuters, 2018. URL https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G.
E. Dimara and C. Perin. What is interaction for data visualization? IEEE transactions on visualization and computer graphics, 26(1): 119–129, 2019.
M. Díaz, I. Johnson, A. Lazar, A. M. Piper and D. Gergle. Addressing age-related bias in sentiment analysis. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pages. 1–14 2018.
C. Duffy. Apple co-founder Steve Wozniak says Apple Card discriminated against his wife. CNN, 2019. URL https://www.cnn.com/2019/11/10/business/goldman-sachs-apple-card-discrimination/index.html.
M. Espadoto, R. M. Martins, A. Kerren, N. S. T. Hirata and A. C. Telea. Toward a Quantitative Survey of Dimension Reduction Techniques. IEEE Transactions on Visualization and Computer Graphics, 27(3): 2153–2173, 2021. DOI 10.1109/TVCG.2019.2944182.
R. Evans and M. Boreland. Multivariate Data Analytics in PV Manufacturing: Four Case Studies Using Manufacturing Datasets. IEEE Journal of Photovoltaics, 8(1): 38–47, 2017.
M. Feurer, A. Klein, K. Eggensperger, J. Springenberg, M. Blum and F. Hutter. Efficient and robust automated machine learning. Advances in neural information processing systems, 28: 2015.
R. A. Fisher. The logic of inductive inference. Journal of the royal statistical society, 98(1): 39–82, 1935. Publisher: JSTOR.
K. R. Gabriel. The biplot graphic display of matrices with application to principal component analysis. Biometrika, 58(3): 453–467, 1971.
F. Galton. Regression towards mediocrity in hereditary stature. The Journal of the Anthropological Institute of Great Britain and Ireland, 15: 246–263, 1886. Publisher: JSTOR.
P. Gijsbers, F. Pfisterer, J. N. van Rijn, B. Bischl and J. Vanschoren. Meta-learning for symbolic hyperparameter defaults. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, pages. 151–152 2021.
K. B. Gorman, T. D. Williams and W. R. Fraser. Ecological sexual dimorphism and environmental variability within a community of Antarctic penguins (genus Pygoscelis). PloS one, 9(3): e90081, 2014.
A. Gosiewska and P. Biecek. IBreakDown: Uncertainty of model explanations for non-additive predictive models. arXiv preprint arXiv:1903.11420, 2019.
A. Gracia, S. González, V. Robles, E. Menasalvas and T. von Landesberger. New Insights into the Suitability of the Third Dimension for Visualizing Multivariate/Multidimensional Data: A Study Based on Loss of Quality Quantification. Information Visualization, 15(1): 3–30, 2016. URL https://doi.org/10.1177/1473871614556393.
B. Greenwell. Fastshap: Fast Approximate Shapley Values. 2020. URL https://CRAN.R-project.org/package=fastshap.
B. Greenwell, B. Boehmke, J. Cunningham and GBM Developers. Gbm: Generalized Boosted Regression Models. 2020. URL https://CRAN.R-project.org/package=gbm.
G. Grinstein, M. Trutschl and U. Cvek. High-Dimensional Visualizations. 14, 2002.
P. Hennerdal. Beyond the periphery: Child and adult understanding of world map continuity. Annals of the Association of American Geographers, 105(4): 773–790, 2015.
A. R. Hevner, S. T. March, J. Park and S. Ram. Design science in information systems research. MIS quarterly, 75–105, 2004.
A. M. Horst, A. P. Hill and K. B. Gorman. Palmerpenguins: Palmer Archipelago (Antarctica) penguin data. 2020. URL https://allisonhorst.github.io/palmerpenguins/.
W. Huber, V. J. Carey, R. Gentleman, S. Anders, M. Carlson, B. S. Carvalho, H. C. Bravo, S. Davis, L. Gatto and T. Girke. Orchestrating high-throughput genomic analysis with Bioconductor. Nature methods, 12(2): 115–121, 2015.
F. Hutter, L. Kotthoff and J. Vanschoren. Automated machine learning: Methods, systems, challenges. Springer Nature, 2019.
R. J. Hyndman. Computing and graphing highest density regions. The American Statistician, 50(2): 120–126, 1996.
Y. Kang, R. J. Hyndman and K. Smith-Miles. Visualising forecasting algorithm performance using time series instance spaces. International Journal of Forecasting, 33(2): 345–358, 2017. URL https://www.sciencedirect.com/science/article/pii/S0169207016301030 [online; last accessed February 15, 2022].
D. A. Keim. Designing pixel-oriented visualization techniques: Theory and applications. IEEE Transactions on visualization and computer graphics, 6(1): 59–78, 2000.
J. Kleinberg and E. Tardos. Algorithm design. Pearson Education India, 2006.
A. A. Kodiyan. An overview of ethical issues in using AI systems in hiring with a case study of Amazon’s AI based hiring tool. Researchgate Preprint, 2019.
K. Komisarczyk, P. Kozminski, S. Maksymiuk and P. Biecek. Treeshap. 2021. URL https://github.com/ModelOriented/treeshap.
U. Laa, D. Cook and G. Valencia. A slice tour for finding hollowness in high-dimensional data. Journal of Computational and Graphical Statistics, 29(3): 681–687, 2020.
J. Larson, S. Mattu, L. Kirchner and J. Angwin. How We Analyzed the COMPAS Recidivism Algorithm. ProPublica, 2016. URL https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm?token=RPR1E2qtzJltfJ0tS-gB_41kmfoWZAu4.
S. Lee, D. Cook, N. da Silva, U. Laa, N. Spyrison, E. Wang and H. S. Zhang. The state-of-the-art on tours for dynamic visualization of high-dimensional data. WIREs Computational Statistics, e1573, 2021. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/wics.1573.
S. Lee, U. Laa and D. Cook. Casting Multiple Shadows: High-Dimensional Interactive Data Visualisation with Tours and Embeddings. arXiv preprint arXiv:2012.06077, 2020.
S. Leone. FIFA 20 complete player dataset. 2020. URL https://kaggle.com/stefanoleone992/fifa-20-complete-player-dataset.
J. Lewis, L. Van der Maaten and V. de Sa. A behavioral investigation of dimensionality reduction. In Proceedings of the Annual Meeting of the Cognitive Science Society, 2012.
A. Liaw and M. Wiener. Classification and regression by randomForest. R news, 2(3): 18–22, 2002.
S. Liu, D. Maljovec, B. Wang, P.-T. Bremer and V. Pascucci. Visualizing High-Dimensional Data: Advances in the Past Decade. IEEE Transactions on Visualization and Computer Graphics, 23(3): 1249–1268, 2017. DOI 10.1109/TVCG.2016.2640960.
A. A. Lubischew. On the use of discriminant functions in taxonomy. Biometrics, 455–477, 1962. DOI 10.2307/2527894.
S. M. Lundberg, G. G. Erion and S.-I. Lee. Consistent individualized feature attribution for tree ensembles. arXiv preprint arXiv:1802.03888, 2018.
S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions. In Proceedings of the 31st international conference on neural information processing systems, pages. 4768–4777 2017.
L. van der Maaten, E. Postma and J. Van den Herik. Dimensionality Reduction: A Comparative Review. J Mach Learn Res, 10(66-71): 13, 2009.
K. Marriott, F. Schreiber, T. Dwyer, K. Klein, N. H. Riche, T. Itoh, W. Stuerzlinger and B. H. Thomas. Immersive Analytics. Springer, 2018.
J. Matejka and G. Fitzmaurice. Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems - CHI ’17, pages. 1290–1294 2017. Denver, Colorado, USA: ACM Press. ISBN 978-1-4503-4655-9. URL http://dl.acm.org/citation.cfm?doid=3025453.3025912.
C. Molnar. Interpretable machine learning. Lulu.com, 2020. URL https://christophm.github.io/interpretable-ml-book/.
T. Munzner. Visualization analysis and design. AK Peters/CRC Press, 2014.
L. Nelson, D. Cook and C. Cruz-Neira. XGobi vs the C2: Results of an Experiment Comparing Data Visualization in a 3-D Immersive Virtual Reality Environment with a 2-D Workstation Display. Computational Statistics, 14(1): 39–52, 1998.
L. G. Nonato and M. Aupetit. Multidimensional projection for visual analytics: Linking techniques with distortions, tasks, and layout enrichment. IEEE Transactions on Visualization and Computer Graphics, 25(8): 2650–2673, 2018.
M. O’Hara-Wild, S. Pearce, R. Nakagawara, S. Gupta, D. Vanichkina, E. Tanaka, T. Fung and R. Hyndman. Gghdr: Visualisation of Highest Density Regions in ’ggplot2’. 2022. URL https://CRAN.R-project.org/package=gghdr.
C. O’Neil. Weapons of math destruction: How big data increases inequality and threatens democracy. Crown, 2016.
M. d’Ocagne. Coordonnées parallèles et axiales. Méthode de transformation géométrique et procédé nouveau de calcul graphique déduits de la considération des coordonnées parallèles. Paris: Gauthier-Villars, 1885.
S. Palan and C. Schitter. Prolific. A subject pool for online experiments. Journal of Behavioral and Experimental Finance, 17: 22–27, 2018.
A. Panagiotelis. Manifold Learning. 2020.
K. Pearson. LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, 2(11): 559–572, 1901.
T. L. Pedersen and D. Robinson. Gganimate: A grammar of animated graphics. 2020. URL https://CRAN.R-project.org/package=gganimate.
F. Pfisterer, J. N. van Rijn, P. Probst, A. C. Müller and B. Bischl. Learning multiple defaults for machine learning algorithms. In Proceedings of the Genetic and Evolutionary Computation Conference Companion, pages. 241–242 2021.
P. Probst, A.-L. Boulesteix and B. Bischl. Tunability: Importance of hyperparameters of machine learning algorithms. The Journal of Machine Learning Research, 20(1): 1934–1965, 2019a.
P. Probst, M. N. Wright and A.-L. Boulesteix. Hyperparameters and tuning strategies for random forest. Wiley Interdisciplinary Reviews: data mining and knowledge discovery, 9(3): e1301, 2019b.
M. T. Ribeiro, S. Singh and C. Guestrin. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages. 1135–1144 2016. New York, NY, USA: Association for Computing Machinery. ISBN 978-1-4503-4232-2. URL https://doi.org/10.1145/2939672.2939778.
J. C. Roberts. State of the art: Coordinated & multiple views in exploratory visualization. In Fifth international conference on coordinated and multiple views in exploratory visualization (CMV 2007), pages. 61–71 2007. IEEE.
O. Rodrigues. Des lois géométriques qui régissent les déplacements d’un système solide dans l’espace: Et de la variation des coordonnées provenant de ces déplacements considérés indépendamment des causes qui peuvent les produire. Journal de Mathématiques Pures et Appliquées, 5: 380–440, 1840.
L. Scrucca, M. Fop, T. B. Murphy and A. E. Raftery. Mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models. The R journal, 8(1): 289–317, 2016. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5096736/.
M. Sedlmair, T. Munzner and M. Tory. Empirical Guidance on Scatterplot and Dimension Reduction Technique Choices. IEEE Transactions on Visualization and Computer Graphics, 19(12): 2634–2643, 2013.
L. S. Shapley. A value for n-person games. Princeton University Press, 1953.
Y. Shi, G. Ke, D. Soukhavong, J. Lamb, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, et al. Lightgbm: Light Gradient Boosting Machine. 2022. URL https://CRAN.R-project.org/package=lightgbm.
G. Shmueli. To explain or to predict? Statistical science, 25(3): 289–310, 2010.
A. Shrikumar, P. Greenside and A. Kundaje. Learning important features through propagating activation differences. In International Conference on Machine Learning, pages. 3145–3153 2017. PMLR.
A. Shrikumar, P. Greenside, A. Shcherbina and A. Kundaje. Not just a black box: Learning important features through propagating activation differences. arXiv preprint arXiv:1605.01713, 2016.
C. Sievert. Interactive web-based data visualization with R, plotly, and shiny. Chapman and Hall/CRC, 2020. URL https://plotly-r.com.
C. Sievert. Plotly for R. 2018. URL https://plotly-book.cpsievert.me.
K. Simonyan, A. Vedaldi and A. Zisserman. Deep inside convolutional networks: Visualising image classification models and saliency maps. In Workshop at International Conference on Learning Representations, 2014. Citeseer.
J. P. Snyder. Map projections–A working manual. US Government Printing Office, 1987.
N. Spyrison and D. Cook. Spinifex: An R Package for Creating a Manual Tour of Low-dimensional Projections of Multivariate Data. The R Journal, 12(1): 243, 2020. URL https://journal.r-project.org/archive/2020/RJ-2020-027/index.html.
E. Strumbelj and I. Kononenko. An efficient explanation of individual classifications using game theory. The Journal of Machine Learning Research, 11: 1–18, 2010.
D. F. Swayne, D. T. Lang, A. Buja and D. Cook. GGobi: Evolving from XGobi into an extensible framework for interactive data visualization. Computational Statistics & Data Analysis, 43(4): 423–444, 2003. URL http://www.sciencedirect.com/science/article/pii/S0167947302002864.
J. W. Tukey. Exploratory data analysis. Pearson, 1977.
A. Unwin and P. Valero-Mora. Ensemble Graphics. Journal of Computational and Graphical Statistics, 27(1): 157–165, 2018. URL https://doi.org/10.1080/10618600.2017.1383264.
L. Vanni, M. Ducoffe, C. Aguilar, F. Precioso and D. Mayaffre. Textual Deconvolution Saliency (TDS): A deep tool box for linguistic analysis. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages. 548–557 2018.
J. Venna and S. Kaski. Visualizing gene interaction graphs with local multidimensional scaling. In ESANN, pages. 557–562 2006. Citeseer.
J. Wagner Filho, M. Rey, C. Freitas and L. Nedel. Immersive Visualization of Abstract Information: An Evaluation on Dimensionally-Reduced Data Scatterplots. 2018.
B.-T. Wang, T. J. Hobbs, S. Doyle, J. Gao, T.-J. Hou, P. M. Nadolsky and F. I. Olness. Mapping the sensitivity of hadronic experiments to nucleon structure. Physical Review D, 98(9): 094030, 2018. DOI 10.1103/PhysRevD.98.094030.
H. Wickham. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York, 2016. URL https://ggplot2.tidyverse.org.
H. Wickham, D. Cook and H. Hofmann. Visualizing statistical models: Removing the blindfold. Statistical Analysis and Data Mining: The ASA Data Science Journal, 8(4): 203–225, 2015. URL http://doi.wiley.com/10.1002/sam.11271.
H. Wickham, D. Cook, H. Hofmann and A. Buja. tourr: An R package for exploring multivariate data with projections. Journal of Statistical Software, 40(2), 2011. URL http://www.jstatsoft.org/v40/i02/.
H. Wickham and G. Grolemund. R for Data Science. O’Reilly Media, 2017. URL https://r4ds.had.co.nz/index.html.
B. J. Winer. Statistical principles in experimental design. 1962.
M. N. Wright and A. Ziegler. Ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R. Journal of Statistical Software, 77(1): 1–17, 2017. DOI 10.18637/jss.v077.i01.
Y. Xie. Bookdown: Authoring Books and Technical Documents with R Markdown. Boca Raton, Florida: Chapman and Hall/CRC, 2016. URL https://github.com/rstudio/bookdown.
L. Yang. 3D Grand Tour for Multidimensional Data and Clusters. In Advances in Intelligent Data Analysis, Eds D. J. Hand, J. N. Kok and M. R. Berthold pages. 173–184 1999. Berlin, Heidelberg: Springer. ISBN 978-3-540-48412-7. DOI 10.1007/3-540-48412-4_15.
L. Yang. Interactive exploration of very large relational datasets through 3D dynamic projections. In Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’00, pages. 236–243 2000. Boston, Massachusetts, United States: ACM Press. ISBN 978-1-58113-233-5. URL http://portal.acm.org/citation.cfm?doid=347090.347134.
Q. Yao, M. Wang, Y. Chen, W. Dai, Y.-F. Li, W.-W. Tu, Q. Yang and Y. Yu. Taking Human out of Learning Applications: A Survey on Automated Machine Learning. arXiv:1810.13306 [cs, stat], 2019. URL http://arxiv.org/abs/1810.13306 [online; last accessed February 11, 2022].