Data Science and Predictive Analytics
The first edition of the textbook Data Science and Predictive Analytics: Biomedical and Health Applications using R, authored by Ivo D. Dinov, was published in August 2018 by Springer.[1] The second edition of the book was published in 2023.[2]
This textbook covers some of the core mathematical foundations, computational techniques, and artificial intelligence approaches used in data science research and applications.[3]
Using the statistical computing platform R and a broad range of biomedical case studies, the 23 chapters of the first edition provide explicit examples of importing, exporting, processing, modeling, visualizing, and interpreting large, multivariate, incomplete, heterogeneous, and longitudinal datasets (big data).[4]
Structure
First edition table of contents
The first edition of the Data Science and Predictive Analytics (DSPA) textbook[1] is divided into the following 23 chapters, each progressively building on the previous content.
- Motivation
- Foundations of R
- Managing Data in R
- Data Visualization
- Linear Algebra & Matrix Computing
- Dimensionality Reduction
- Lazy Learning: Classification Using Nearest Neighbors
- Probabilistic Learning: Classification Using Naive Bayes
- Decision Tree Divide and Conquer Classification
- Forecasting Numeric Data Using Regression Models
- Black Box Machine-Learning Methods: Neural Networks and Support Vector Machines
- Apriori Association Rules Learning
- k-Means Clustering
- Model Performance Assessment
- Improving Model Performance
- Specialized Machine Learning Topics
- Variable/Feature Selection
- Regularized Linear Modeling and Controlled Variable Selection
- Big Longitudinal Data Analysis
- Natural Language Processing/Text Mining
- Prediction and Internal Statistical Cross Validation
- Function Optimization
- Deep Learning, Neural Networks
Second edition table of contents
The significantly reorganized second edition of the book (2023)[2] expands and modernizes the presented mathematical principles, computational methods, data science techniques, model-based machine learning, and model-free artificial intelligence algorithms. The 14 chapters of the new edition start with an introduction and progressively build foundational skills, leading up to biomedical applications of deep learning.
- Introduction
- Basic Visualization and Exploratory Data Analytics
- Linear Algebra, Matrix Computing, and Regression Modeling
- Linear and Nonlinear Dimensionality Reduction
- Supervised Classification
- Black Box Machine Learning Methods
- Qualitative Learning Methods—Text Mining, Natural Language Processing, and Apriori Association Rules Learning
- Unsupervised Clustering
- Model Performance Assessment, Validation, and Improvement
- Specialized Machine Learning Topics
- Variable Importance and Feature Selection
- Big Longitudinal Data Analysis
- Function Optimization
- Deep Learning, Neural Networks
Reception
The Data Science and Predictive Analytics (DSPA) textbook has been reviewed in the Journal of the American Statistical Association,[5] International Statistical Institute’s ISI Review Journal,[3] and the Journal of the American Library Association.[4] Many scholarly publications reference the DSPA textbook.[6][7]
As of January 17, 2021, the electronic version of the first edition (ISBN 978-3-319-72347-1) is freely available on SpringerLink[8] and has been downloaded over 6 million times. The textbook is globally available in print (hardcover and softcover) and electronic (PDF and EPUB) formats in many college and university libraries[9] and has been used for data science, computational statistics, and analytics classes at various institutions.[10]
References