Analytics in a Big Data World. The Essential Guide to Data Science and its Applications. Wiley and SAS Business Series

Praise for Analytics in a Big Data World: The Essential Guide to Data Science and its Applications

“Just by continuously exploiting masses of data, companies like Google, Facebook, Uber, Waze, Zillow, etc. have been able to shake up traditional operating models and industries. Putting the required effort and investment in collecting and exploiting new sets of data is simply a must for competitive advantage. The good news is that today, thanks to the rapidly evolving field of technology, we can collect, store and analyze any type of data at lower cost and faster than ever. With this book, the author provides a unique blend of research and business insights into data science and/or analytics, making it a must read for anyone using these technologies to gain sustainable strategic leverage!”
—Sabine Everaet, Europe CIO, The Coca-Cola Company

“Technology companies today, such as eBay, Amazon, and Facebook, touch large volumes of users and generate massive amounts of data, from transactional to behavioral. An understanding of how to extract value from these massive datasets is critical for all of these companies’ ability to compete for customers. Building upon his profound business expertise and knowledge, the author describes the real-world application of varied data science and analytical techniques that would serve as an excellent guide for analytics professionals as they attempt to use the insights residing in the stores of company data to drive decision-making in their organizations.”
—Steve Metz, Senior Director, Global Customer Experience Finance/Analytics & Collections, eBay

Turn Big Data into Big Opportunities

“Where do we start?” More and more businesses are asking this question as the need to strategically manage data intensifies. Analytics in a Big Data World addresses the seemingly Herculean task of coming to grips with multiple channels of data and sculpting them into quantifiable value. This book is for business professionals who want a focused, practical approach to big data analytics. Analytics researcher Bart Baesens focuses on case studies, real-world application, and steps for implementation, using theory and mathematical formulas only when necessary.

The number of strategic applications for big data is constantly expanding. Analytics in a Big Data World provides an approach to data that can be used in customer relationship management, social media, risk management, and beyond. Past behavior can predict future trends so that you can react more effectively. Learn how to begin describing and predicting customers’ complex behavioral patterns, and find out how to apply your analysis in ways that have been proven to add value and target the bottom line.

Big data sets are assets that can be leveraged quickly and inexpensively. As the science of analytics penetrates every industry in every sector, businesses that fail to use their data assets wisely could fall behind the competition. The flood of new information available to businesses has changed the rules of identifying new business opportunities. Analytics in a Big Data World will help you harness the innovations in data science and address the challenges involved in taming big data.

Preface

Acknowledgments

Chapter 1 Big Data and Analytics

Example Applications

Basic Nomenclature

Analytics Process Model

Job Profiles Involved

Analytics

Analytical Model Requirements

Notes

Chapter 2 Data Collection, Sampling, and Preprocessing

Types of Data Sources

Sampling

Types of Data Elements

Visual Data Exploration and Exploratory Statistical Analysis

Missing Values

Outlier Detection and Treatment

Standardizing Data

Categorization

Weights of Evidence Coding

Variable Selection

Segmentation

Notes

Chapter 3 Predictive Analytics

Target Definition

Linear Regression

Logistic Regression

Decision Trees

Neural Networks

Support Vector Machines

Ensemble Methods

Multiclass Classification Techniques

Evaluating Predictive Models

Notes

Chapter 4 Descriptive Analytics

Association Rules

Sequence Rules

Segmentation

Notes

Chapter 5 Survival Analysis

Survival Analysis Measurements

Kaplan Meier Analysis

Parametric Survival Analysis

Proportional Hazards Regression

Extensions of Survival Analysis Models

Evaluating Survival Analysis Models

Notes

Chapter 6 Social Network Analytics

Social Network Definitions

Social Network Metrics

Social Network Learning

Relational Neighbor Classifier

Probabilistic Relational Neighbor Classifier

Relational Logistic Regression

Collective Inferencing

Egonets

Bigraphs

Notes

Chapter 7 Analytics: Putting It All to Work

Backtesting Analytical Models

Benchmarking

Data Quality

Software

Privacy

Model Design and Documentation

Corporate Governance

Notes

Chapter 8 Example Applications

Credit Risk Modeling

Fraud Detection

Net Lift Response Modeling

Churn Prediction

Recommender Systems

Web Analytics

Social Media Analytics

Business Process Analytics

Notes

About the Author

Index

BART BAESENS is an associate professor at KU Leuven (Belgium) and a lecturer at the University of Southampton (United Kingdom), as well as an internationally known data analytics consultant. He is a foremost researcher in the areas of web analytics, customer relationship management, and fraud detection. His findings have been published in well-known international journals including Machine Learning and Management Science. Baesens is also co-author of the book Credit Risk Management: Basic Concepts (Oxford University Press, 2008).

