Handbook of Statistical Analysis and Data Mining Applications, Second Edition, is a comprehensive professional reference book that guides business analysts, scientists, engineers and researchers, both academic and industrial, through all stages of data analysis, model building and implementation. The handbook helps users discern technical and business problems, understand the strengths and weaknesses of modern data mining algorithms and employ the right statistical methods for practical application.
This book is an ideal reference for users who want to address massive and complex datasets with novel statistical approaches and be able to objectively evaluate analyses and solutions. It has clear, intuitive explanations of the principles and tools for solving problems using modern analytic techniques and discusses their application to real problems in ways accessible and beneficial to practitioners across several areas, from science and engineering to medicine, academia and commerce.
- Includes input by practitioners for practitioners
- Includes tutorials in numerous fields of study that provide step-by-step instruction on how to use supplied tools to build models
- Contains practical advice from successful real-world implementations
- Brings together, in a single resource, all the information a beginner needs to understand the tools and issues in data mining and to build successful data mining solutions
- Features clear, intuitive explanations of novel analytical tools and techniques, and their practical applications
Part 1: History of Phases of Data Analysis, Basic Theory, and the Data Mining Process
1. The Background for Data Mining Practice
2. Theoretical Considerations for Data Mining
3. The Data Mining and Predictive Analytic Process
4. Data Understanding and Preparation
5. Feature Selection
6. Accessory Tools for Doing Data Mining
Part 2: The Algorithms and Methods in Data Mining and Predictive Analytics and Some Domain Areas
7. Basic Algorithms for Data Mining: A Brief Overview
8. Advanced Algorithms for Data Mining
9. Classification
10. Numerical Prediction
11. Model Evaluation and Enhancement
12. Predictive Analytics for Population Health and Care
13. Big Data in Education: New Efficiencies for Recruitment, Learning, and Retention of Students and Donors
14. Customer Response Modeling
15. Fraud Detection
Part 3: Tutorials and Case Studies
Tutorial A Example of Data Mining Recipes Using Windows 10 and Statistica 13
Tutorial B Using the Statistica Data Mining Workspace Method for Analysis of Hurricane Data (Hurrdata.sta)
Tutorial C Case Study: Using SPSS Modeler and STATISTICA to Predict Student Success at High-Stakes Nursing Examinations (NCLEX)
Tutorial D Constructing a Histogram in KNIME Using MidWest Company Personality Data
Tutorial E Feature Selection in KNIME
Tutorial F Medical/Business Tutorial
Tutorial G A KNIME Exercise, Using Alzheimer's Training Data of Tutorial F
Tutorial H Data Prep 1-1: Merging Data Sources
Tutorial I Data Prep 1-2: Data Description
Tutorial J Data Prep 2-1: Data Cleaning and Recoding
Tutorial K Data Prep 2-2: Dummy Coding Category Variables
Tutorial L Data Prep 2-3: Outlier Handling
Tutorial M Data Prep 3-1: Filling Missing Values With Constants
Tutorial N Data Prep 3-2: Filling Missing Values With Formulas
Tutorial O Data Prep 3-3: Filling Missing Values With a Model
Tutorial P City of Chicago Crime Map: A Case Study Predicting Certain Kinds of Crime Using Statistica Data Miner and Text Miner
Tutorial Q Using Customer Churn Data to Develop and Select a Best Predictive Model for Client Defection Using STATISTICA Data Miner 13 64-bit for Windows 10
Tutorial R Example With C&RT to Predict and Display Possible Structural Relationships
Tutorial S Clinical Psychology: Making Decisions About Best Therapy for a Client
Part 4: Model Ensembles, Model Complexity; Using the Right Model for the Right Use, Significance, Ethics, and the Future, and Advanced Processes
16. The Apparent Paradox of Complexity in Ensemble Modeling
17. The "Right Model" for the "Right Purpose": When Less Is Good Enough
18. A Data Preparation Cookbook
19. Deep Learning
20. Significance versus Luck in the Age of Mining: The Issues of P-Value "Significance" and "Ways to Test Significance of Our Predictive Analytic Models"
21. Ethics and Data Analytics
22. IBM Watson
Dr. Robert Nisbet was trained initially in Ecology and Ecosystems Analysis. He has over 30 years of experience in complex systems analysis and modeling, most recently as a Researcher (University of California, Santa Barbara). In business, he pioneered the design and development of configurable data mining applications for retail sales forecasting, and for Churn, Propensity-to-buy, and Customer Acquisition in the Telecommunications, Insurance, Banking, and Credit industries. In addition to data mining, he has expertise in data warehousing technology for Extract, Transform, and Load (ETL) operations, Business Intelligence reporting, and data quality analyses. He is lead author of the "Handbook of Statistical Analysis & Data Mining Applications" (Academic Press, 2009) and a co-author of "Practical Text Mining" (Academic Press, 2012) and "Practical Predictive Analytics and Decisioning Systems for Medicine" (Academic Press, 2015). Currently, he serves as an Instructor in the University of California, Irvine Predictive Analytics Certificate Program, teaching online and on-campus courses in Effective Data Preparation and Applications of Predictive Analytics. Additionally, Bob is in the last stages of writing another book on "Data Preparation for Predictive Analytic Modeling."
Dr. Gary Miner received a B.S. from Hamline University, St. Paul, MN, with biology, chemistry, and education majors; an M.S. in zoology and population genetics from the University of Wyoming; and a Ph.D. in biochemical genetics from the University of Kansas as the recipient of a NASA pre-doctoral fellowship. He pursued additional National Institutes of Health postdoctoral studies at the University of Minnesota and the University of Iowa, eventually becoming immersed in the study of affective disorders and Alzheimer's disease.
In 1985, he and his wife, Dr. Linda Winters-Miner, founded the Familial Alzheimer's Disease Research Foundation, which became a leading force in organizing both local and international scientific meetings. These meetings brought together the leaders in the field of genetics of Alzheimer's from several countries and resulted in the first major book on the genetics of Alzheimer's disease. In the mid-1990s, Dr. Miner turned his data analysis interests to the business world, joining the team at StatSoft and deciding to specialize in data mining. He started developing what eventually became the Handbook of Statistical Analysis and Data Mining Applications (co-authored with Drs. Robert A. Nisbet and John Elder), which received the 2009 American Publishers Award for Professional and Scholarly Excellence (PROSE). Their follow-up collaboration, Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications, also received a PROSE award in February of 2013. Gary was also a co-author of "Practical Predictive Analytics and Decisioning Systems for Medicine" (Academic Press, 2015). Overall, Dr. Miner's career has focused on medicine and health issues, and on the use of data analytics (statistics and predictive analytics) in analyzing medical data to distinguish fact from fiction.
Gary has also served as a Merit Reviewer for PCORI (Patient Centered Outcomes Research Institute), which awards grants for predictive analytics research into the comparative effectiveness and heterogeneous treatment effects of medical interventions, including drugs, among different genetic groups of patients. Additionally, he teaches online classes in 'Introduction to Predictive Analytics', 'Text Analytics', 'Risk Analytics', and 'Healthcare Predictive Analytics' for the University of California-Irvine. Recently, until 'official retirement' 18 months ago, he spent most of his time in his primary role as Senior Analyst-Healthcare Applications Specialist for Dell | Information Management Group, Dell Software, a role that came through Dell's acquisition of StatSoft (www.StatSoft.com) in April 2014. Currently, Gary is working on two new short popular books on 'Healthcare Solutions for the USA' and 'Patient-Doctor Genomics Stories'.
Dr. Kenneth Yale has a track record of Business Development, Product Innovation, and Strategy in both entrepreneurial and large companies across healthcare industry verticals, including Health Payers, Life Sciences, and Government Programs. He is an agile executive who identifies future industry trends and seizes opportunities to build sustainable businesses. His accomplishments include innovations in Health Insurance, Care Management, Data Science, Big Data Healthcare Analytics, Clinical Decision Support, and Precision Medicine.
His prior experience includes medical director and vice president of clinical solutions at ActiveHealth Management/Aetna, chief executive of the innovation incubator business unit at UnitedHealth Group Community & State, strategic counsel for Johnson & Johnson, corporate vice president of CorSolutions and Matria Healthcare, senior vice president and general counsel at EduNeering, and founder and CEO of Advanced Health Solutions. Dr. Yale previously worked in the federal government as a Commissioned Officer in the US Public Health Service, Legislative Counsel in the US Senate, Special Assistant to the President and Executive Director of the White House Domestic Policy Council, and Chief of Staff of the White House Office of Science and Technology.
Dr. Yale provides leadership to and actively participates in industry organizations, including the American Medical Informatics Association - Workgroup on Genomics and Translational Bioinformatics; Bloomberg/BNA Health Insurance Advisory Board; Healthcare Information and Management Systems Society; and the URAC Accreditation Organization. He is a frequent speaker and author on health and technology topics, including the books "Managed Care Compliance Guide" and "Clinical Integration: Population Health and Accountable Care". He is also a tutorial author in "Practical Predictive Analytics and Decisioning Systems for Medicine" and an editor of "Statistical Analysis and Data Mining Applications, Second Edition".