Exploratory Data Analysis with MATLAB, Third Edition, Martinez, Martinez, Solka
Praise for the Second Edition:

"The authors present an intuitive and easy-to-read book. ... accompanied by many examples, proposed exercises, good references, and comprehensive appendices that initiate the reader unfamiliar with MATLAB."
--Adolfo Alvarez Pinto, International Statistical Review

"Practitioners of EDA who use MATLAB will want a copy of this book. ... The authors have done a great service by bringing together so many EDA routines, but their main accomplishment in this dynamic text is providing the understanding and tools to do EDA."
--David A Huckaby, MAA Reviews

Exploratory Data Analysis (EDA) is an important part of the data analysis process. The methods presented in this text are ones that should be in the toolkit of every data scientist. As computational sophistication has increased and data sets have grown in size and complexity, EDA has become an even more important process for visualizing and summarizing data before making assumptions to generate hypotheses and models.

Exploratory Data Analysis with MATLAB, Third Edition presents EDA methods from a computational perspective and uses numerous examples and applications to show how the methods are used in practice. The authors use MATLAB code, pseudo-code, and algorithm descriptions to illustrate the concepts. The MATLAB code for examples, data sets, and the EDA Toolbox are available for download on the book's website.

New to the Third Edition:
Random projections and estimating local intrinsic dimensionality
Deep learning autoencoders and stochastic neighbor embedding
Minimum spanning tree and additional cluster validity indices
Kernel density estimation
Plots for visualizing data distributions, such as bean plots and violin plots
A chapter on visualizing categorical data

This book describes the various methods used for exploratory data analysis with an emphasis on MATLAB implementation.
It covers approaches for visualizing data, data tours and animations, clustering (or unsupervised learning), dimensionality reduction, and more. A set of graphical user interfaces (GUIs) allows users to apply the ideas to their own data. This book is suitable for advanced undergraduate and graduate students.
Angel R. Martinez, Jeffrey Solka, Wendy L. Martinez
Chapman and Hall/CRC Computer Science and Data Analysis
CRC Press LLC
Table of Contents
Introduction to Exploratory Data Analysis. Dimensionality Reduction - Linear Methods. Dimensionality Reduction - Nonlinear Methods. Data Tours. Finding Clusters. Model-Based Clustering. Smoothing Scatterplots. Visualizing Clusters. Distribution Shapes. Multivariate Visualization. Appendices.

Part I Introduction to Exploratory Data Analysis
What is Exploratory Data Analysis. Overview of the Text. A Few Words about Notation. Data Sets Used in the Book. Unstructured Text Documents. Gene Expression Data. Oronsay Data Set. Software Inspection. Transforming Data. Power Transformations. Standardization. Sphering the Data. Further Reading. Exercises.

Part II EDA as Pattern Discovery
Dimensionality Reduction -- Linear Methods: Introduction. Principal Component Analysis -- PCA. PCA Using the Sample Covariance Matrix. PCA Using the Sample Correlation Matrix. How Many Dimensions Should We Keep? Singular Value Decomposition -- SVD. Nonnegative Matrix Factorization. Factor Analysis. Fisher's Linear Discriminant. Random Projections. Intrinsic Dimensionality. Nearest Neighbor Approach. Correlation Dimension. Maximum Likelihood Approach. Estimation Using Packing Numbers. Estimation of Local Dimension. Summary and Further Reading. Exercises.
Dimensionality Reduction -- Nonlinear Methods: Multidimensional Scaling -- MDS. Metric MDS. Nonmetric MDS. Manifold Learning. Locally Linear Embedding. Isometric Feature Mapping -- ISOMAP. Hessian Eigenmaps. Artificial Neural Network Approaches. Self-Organizing Maps. Generative Topographic Maps. Curvilinear Component Analysis. Autoencoders. Stochastic Neighbor Embedding. Summary and Further Reading. Exercises.
Data Tours: Grand Tour. Torus Winding Method. Pseudo Grand Tour. Interpolation Tours. Projection Pursuit. Projection Pursuit Indexes. Posse Chi-Square Index. Moment Index. Independent Component Analysis. Summary and Further Reading. Exercises.
Finding Clusters: Introduction. Hierarchical Methods. Optimization Methods -- k-Means. Spectral Clustering. Document Clustering. Nonnegative Matrix Factorization -- Revisited. Probabilistic Latent Semantic Analysis. Minimal Spanning Trees and Clustering. Definitions. Minimum Spanning Tree Clustering. Evaluating the Clusters. Rand Index. Cophenetic Correlation. Upper Tail Rule. Silhouette Plot. Gap Statistic. Cluster Validity Indices. Summary and Further Reading. Exercises.
Model-Based Clustering: Overview of Model-Based Clustering. Finite Mixtures. Multivariate Finite Mixtures. Component Models -- Constraining the Covariances. Expectation-Maximization Algorithm. Hierarchical Agglomerative Model-Based Clustering. Model-Based Clustering. MBC for Density Estimation and Discriminant Analysis. Introduction to Pattern Recognition. Bayes Decision Theory. Estimating Probability Densities with MBC. Generating Random Variables from a Mixture Model. Summary and Further Reading. Exercises.
Smoothing Scatterplots: Introduction. Loess. Robust Loess. Residuals and Diagnostics with Loess. Residual Plots. Spread Smooth. Loess Envelopes -- Upper and Lower Smooths. Smoothing Splines. Regression with Splines. Smoothing Splines. Smoothing Splines for Uniformly Spaced Data. Choosing the Smoothing Parameter. Bivariate Distribution Smooths. Pairs of Middle Smoothings. Polar Smoothing. Curve Fitting Toolbox. Summary and Further Reading. Exercises.

Part III Graphical Methods for EDA
Visualizing Clusters: Dendrogram. Treemaps. Rectangle Plots. ReClus Plots. Data Image. Summary and Further Reading. Exercises.
Distribution Shapes: Histograms. Univariate Histograms. Bivariate Histograms. Kernel Density. Univariate Kernel Density Estimation. Multivariate Kernel Density Estimation. Boxplots. The Basic Boxplot. Variations of the Basic Boxplot. Violin Plots. Beeswarm Plot. Bean Plot. Quantile Plots. Probability Plots. Quantile-Quantile Plot. Quantile Plot. Bagplots. Rangefinder Boxplot. Summary and Further Reading. Exercises.
Multivariate Visualization: Glyph Plots. Scatterplots. 2-D and 3-D Scatterplots. Scatterplot Matrices. Scatterplots with Hexagonal Binning. Dynamic Graphics. Identification of Data. Linking. Brushing. Coplots. Dot Charts. Basic Dot Chart. Multiway Dot Chart. Plotting Points as Curves. Parallel Coordinate Plots. Andrews' Curves. Andrews' Images. More Plot Matrices. Data Tours Revisited. Grand Tour. Permutation Tour. Biplots. Summary and Further Reading. Exercises.
Visualizing Categorical Data: Discrete Distributions. Binomial