Learn to perform sophisticated data analysis using SQL and Excel
SQL is the essential language for querying databases, and Excel is the most popular tool for data presentation and analysis. Combined, they create a powerful, accessible tool for business data analysis. Many important types of analysis do not require complex and expensive data mining tools. The answers are on your desktop.
This no-nonsense guide, written by a leading expert on business data mining, shows you how to design and perform sophisticated data analysis using SQL and Excel. The highly regarded first edition has been revised to cover the newest enhancements to SQL and Excel, including new techniques and real-world examples. This edition features the up-to-date information business managers and data analysts need.
The book begins with the basics of SQL for data mining, Excel to present results, and simple ideas from statistics to understand your data. Core analytic techniques are explained as you learn to run them on real data using Excel and SQL. The chapters progress from basic queries to increasingly detailed applications as you learn why and when to perform specific types of analysis, how to design and perform them, and powerful ways of presenting the results. Each step explains the business context, the technical approach, and the implementation in these familiar tools.
As you progress, you'll discover the importance of geography, how to chart changes in data over time, how to use survival analysis to understand customer tenure and churn, and the factors that affect survival. You will explore methods for analyzing customer purchase patterns, market basket analysis, and association rules. Also included are important data mining models in SQL, linear regression models, and naive Bayesian models; guidance on building a customer signature; methods for analyzing results, including cumulative gains charts and ROC charts; and best practices for using SQL and getting the best performance from your queries.
With more than 100 pages of new material, the fully revised second edition of Data Analysis Using SQL and Excel enables you to:
- Understand core analytic techniques that work with SQL and Excel
- Analyze and interpret data in a table
- Present data professionally in Excel charts
- Apply the chi-square measure and other important statistical techniques in both SQL and Excel
- Understand best practices for SQL queries, with a chapter devoted to performance
- Use survival analysis to understand time-to-event problems, both for single events and for repeated events
- Use market basket analysis to understand purchasing behavior
- Identify the analytic approach that gets the result you're looking for
- Avoid common pitfalls
- Maximize the value of the data you have about your customers and your business
The companion website includes datasets for all examples in the book as well as related Excel spreadsheets.
Chapter 1 A Data Miner Looks at SQL
Chapter 2 What's in a Table? Getting Started with Data Exploration
Chapter 3 How Different Is Different?
Chapter 4 Where Is It All Happening? Location, Location, Location
Chapter 5 It's a Matter of Time
Chapter 6 How Long Will Customers Last? Survival Analysis to Understand Customers and Their Value
Chapter 7 Factors Affecting Survival: The What and Why of Customer Tenure
Chapter 8 Customer Purchases and Other Repeated Events
Chapter 9 What's in a Shopping Cart? Market Basket Analysis
Chapter 10 Association Rules and Beyond
Chapter 11 Data Mining Models in SQL
Chapter 12 The Best-Fit Line: Linear Regression Models
Chapter 13 Building Customer Signatures for Further Analysis
Chapter 14 Performance Is the Issue: Using SQL Effectively
Appendix Equivalent Constructs Among Databases
GORDON S. LINOFF has been working with databases for more decades than he cares to admit. He started learning about SQL by memorizing the SQL-92 standard while leading a development team (at the now-defunct Thinking Machines Corporation) writing the first high-performance database focused on the complex queries needed for decision support.
After that endeavor, Gordon co-founded Data Miners in 1998, a consulting practice devoted to data mining, analytics, and big data. A constant theme in his work is data, often data in relational databases. His SQL skills have only gotten stronger over the years. In 2014, he was the top contributor to Stack Overflow, the leading question-and-answer site for technical questions.
His other books include the bestselling Data Mining Techniques, Third Edition; Mastering Data Mining; and Mining the Web, all of which focus on data mining and analysis. This book follows on the popularity of the first edition, with a practical focus on how to actually get and interpret results.