Data Mining and Machine Learning Applications. Edition No. 1

  • Book

  • 496 Pages
  • February 2022
  • John Wiley and Sons Ltd
  • ID: 5841627

The book elaborates in detail on the current needs of data mining and machine learning and promotes mutual understanding among researchers in different disciplines, thus facilitating research development and collaboration.

Data, the latest currency of today’s world, is the new gold. In this new form of gold, the most beautiful jewels are data analytics and machine learning. Data mining and machine learning are considered interdisciplinary fields. Data mining is a subset of data analytics, and machine learning involves the use of algorithms that automatically improve through experience based on data.

Massive datasets can be classified and clustered to obtain accurate results. The most common technologies used include classification and clustering methods. Accuracy and error rates are calculated for regression, classification, and clustering to evaluate the results of algorithms such as support vector machines and neural networks with forward and backward propagation. Applications include fraud detection, image processing, medical diagnosis, weather prediction, e-commerce, and so forth.

The book features:

  • A review of the state-of-the-art in data mining and machine learning
  • A review and description of the learning methods in human-computer interaction
  • Implementation strategies and future research directions used to meet the design and application requirements of several modern and real-time applications for a long time
  • The scope and implementation of a majority of data mining and machine learning strategies
  • A discussion of real-time problems

Audience

Industry and academic researchers, scientists, and engineers in information technology, data science and machine and deep learning, as well as artificial intelligence more broadly.

Table of Contents

Preface xvii

1 Introduction to Data Mining 1
Santosh R. Durugkar, Rohit Raja, Kapil Kumar Nagwanshi and Sandeep Kumar

1.1 Introduction 1

1.1.1 Data Mining 1

1.2 Knowledge Discovery in Database (KDD) 2

1.2.1 Importance of Data Mining 3

1.2.2 Applications of Data Mining 3

1.2.3 Databases 4

1.3 Issues in Data Mining 6

1.4 Data Mining Algorithms 7

1.5 Data Warehouse 9

1.6 Data Mining Techniques 10

1.7 Data Mining Tools 11

1.7.1 Python for Data Mining 12

1.7.2 KNIME 13

1.7.3 Rapid Miner 17

References 18

2 Classification and Mining Behavior of Data 21
Srinivas Konda, Kavitarani Balmuri and Kishore Kumar Mamidala

2.1 Introduction 22

2.2 Main Characteristics of Mining Behavioral Data 23

2.2.1 Mining Dynamic/Streaming Data 23

2.2.2 Mining Graph & Network Data 24

2.2.3 Mining Heterogeneous/Multi-Source Information 25

2.2.3.1 Multi-Source and Multidimensional Information 26

2.2.3.2 Multi-Relational Data 26

2.2.3.3 Background and Connected Data 27

2.2.3.4 Complex Data, Sequences, and Events 27

2.2.3.5 Data Protection and Morals 27

2.2.4 Mining High Dimensional Data 28

2.2.5 Mining Imbalanced Data 29

2.2.5.1 The Class Imbalance Issue 29

2.2.6 Mining Multimedia Data 30

2.2.6.1 Common Applications Multimedia Data Mining 31

2.2.6.2 Multimedia Data Mining Utilizations 31

2.2.6.3 Multimedia Database Management 32

2.2.7 Mining Scientific Data 34

2.2.8 Mining Sequential Data 35

2.2.9 Mining Social Networks 36

2.2.9.1 Social-Media Data Mining Reasons 39

2.2.10 Mining Spatial and Temporal Data 40

2.2.10.1 Utilizations of Spatial and Temporal Data Mining 41

2.3 Research Method 44

2.4 Results 48

2.5 Discussion 49

2.6 Conclusion 50

References 51

3 A Comparative Overview of Hybrid Recommender Systems: Review, Challenges, and Prospects 57
Rakhi Seth and Aakanksha Sharaff

3.1 Introduction 58

3.2 Related Work on Different Recommender System 60

3.2.1 Challenges in RS 65

3.2.2 Research Questions and Architecture of This Paper 66

3.2.3 Background 68

3.2.3.1 The Architecture of Hybrid Approach 69

3.2.4 Analysis 78

3.2.4.1 Evaluation Measures 78

3.2.5 Materials and Methods 81

3.2.6 Comparative Analysis With Traditional Recommender System 85

3.2.7 Practical Implications 85

3.2.8 Conclusion & Future Work 94

References 94

4 Stream Mining: Introduction, Tools & Techniques and Applications 99
Naresh Kumar Nagwani

4.1 Introduction 100

4.2 Data Reduction: Sampling and Sketching 101

4.2.1 Sampling 101

4.2.2 Sketching 102

4.3 Concept Drift 103

4.4 Stream Mining Operations 105

4.4.1 Clustering 105

4.4.2 Classification 106

4.4.3 Outlier Detection 107

4.4.4 Frequent Itemsets Mining 108

4.5 Tools & Techniques 109

4.5.1 Implementation in Java 110

4.5.2 Implementation in Python 116

4.5.3 Implementation in R 118

4.6 Applications 120

4.6.1 Stock Prediction in Share Market 120

4.6.2 Weather Forecasting System 121

4.6.3 Finding Trending News and Events 121

4.6.4 Analyzing User Behavior in Electronic Commerce Site (Click Stream) 121

4.6.5 Pollution Control Systems 122

4.7 Conclusion 122

References 122

5 Data Mining Tools and Techniques: Clustering Analysis 125
Rohit Miri, Amit Kumar Dewangan, S.R. Tandan, Priya Bhatnagar and Hiral Raja

5.1 Introduction 126

5.2 Data Mining Task 129

5.2.1 Data Summarization 129

5.2.2 Data Clustering 129

5.2.3 Classification of Data 129

5.2.4 Data Regression 130

5.2.5 Data Association 130

5.3 Data Mining Algorithms and Methodologies 131

5.3.1 Data Classification Algorithm 131

5.3.2 Predication 132

5.3.3 Association Rule 132

5.3.4 Neural Network 132

5.3.4.1 Data Clustering Algorithm 133

5.3.5 In-Depth Study of Gathering Techniques 134

5.3.6 Data Partitioning Method 134

5.3.7 Hierarchical Method 134

5.3.8 Framework-Based Method 136

5.3.9 Model-Based Method 136

5.3.10 Thickness-Based Method 136

5.4 Clustering the Nearest Neighbor 136

5.4.1 Fuzzy Clustering 137

5.4.2 K-Algorithm Means 137

5.5 Data Mining Applications 138

5.6 Materials and Strategies for Document Clustering 140

5.6.1 Features Generation 142

5.7 Discussion and Results 143

5.7.1 Discussion 146

5.7.2 Conclusion 149

References 149

6 Data Mining Implementation Process 151
Kamal K. Mehta, Rajesh Tiwari and Nishant Behar

6.1 Introduction 151

6.2 Data Mining Historical Trends 152

6.3 Processes of Data Analysis 153

6.3.1 Data Attack 153

6.3.2 Data Mixing 153

6.3.3 Data Collection 153

6.3.4 Data Conversion 154

6.3.4.1 Data Mining 154

6.3.4.2 Design Evaluation 154

6.3.4.3 Data Illustration 154

6.3.4.4 Implementation of Data Mining in the Cross-Industry Standard Process 154

6.3.5 Business Understanding 155

6.3.6 Data Understanding 156

6.3.7 Data Preparation 158

6.3.8 Modeling 159

6.3.9 Evaluation 160

6.3.10 Deployment 161

6.3.11 Contemporary Developments 162

6.3.12 An Assortment of Data Mining 162

6.3.12.1 Using Computational & Connectivity Tools 163

6.3.12.2 Web Mining 163

6.3.12.3 Comparative Statement 163

6.3.13 Advantages of Data Mining 163

6.3.14 Drawbacks of Data Mining 165

6.3.15 Data Mining Applications 165

6.3.16 Methodology 167

6.3.17 Results 169

6.3.18 Conclusion and Future Scope 171

References 172

7 Predictive Analytics in IT Service Management (ITSM) 175
Sharon Christa I.L. and Suma V.

7.1 Introduction 176

7.2 Analytics: An Overview 178

7.2.1 Predictive Analytics 180

7.3 Significance of Predictive Analytics in ITSM 181

7.4 Ticket Analytics: A Case Study 186

7.4.1 Input Parameters 188

7.4.2 Predictive Modeling 188

7.4.3 Random Forest Model 189

7.4.4 Performance of the Predictive Model 191

7.5 Conclusion 191

References 192

8 Modified Cross-Sell Model for Telecom Service Providers Using Data Mining Techniques 195
K. Ramya Laxmi, Sumit Srivastava, K. Madhuravani, S. Pallavi and Omprakash Dewangan

8.1 Introduction 196

8.2 Literature Review 198

8.3 Methodology and Implementation 200

8.3.1 Selection of the Independent Variables 200

8.4 Data Partitioning 203

8.4.1 Interpreting the Results of Logistic Regression Model 203

8.5 Conclusions 204

References 205

9 Inductive Learning Including Decision Tree and Rule Induction Learning 209
Raj Kumar Patra, A. Mahendar and G. Madhukar

9.1 Introduction 210

9.2 The Inductive Learning Algorithm (ILA) 212

9.3 Proposed Algorithms 213

9.4 Divide & Conquer Algorithm 214

9.4.1 Decision Tree 214

9.5 Decision Tree Algorithms 215

9.5.1 ID3 Algorithm 215

9.5.2 Separate and Conquer Algorithm 217

9.5.3 RULE EXTRACTOR-1 226

9.5.4 Inductive Learning Applications 226

9.5.4.1 Education 226

9.5.4.2 Making Credit Decisions 227

9.5.5 Multidimensional Databases and OLAP 228

9.5.6 Fuzzy Choice Trees 228

9.5.7 Fuzzy Choice Tree Development From a Multidimensional Database 229

9.5.8 Execution and Results 230

9.6 Conclusion and Future Work 231

References 232

10 Data Mining for Cyber-Physical Systems 235
M. Varaprasad Rao, D. Anji Reddy, Anusha Ampavathi and Shaik Munawar

10.1 Introduction 236

10.1.1 Models of Cyber-Physical System 238

10.1.2 Statistical Model-Based Methodologies 239

10.1.3 Spatial-and-Transient Closeness-Based Methodologies 240

10.2 Feature Recovering Methodologies 240

10.3 CPS vs. IT Systems 241

10.4 Collections, Sources, and Generations of Big Data for CPS 242

10.4.1 Establishing Conscious Computation and Information Systems 243

10.5 Spatial Prediction 243

10.5.1 Global Optimization 244

10.5.2 Big Data Analysis CPS 245

10.5.3 Analysis of Cloud Data 245

10.5.4 Analysis of Multi-Cloud Data 247

10.6 Clustering of Big Data 248

10.7 NoSQL 251

10.8 Cyber Security and Privacy Big Data 251

10.8.1 Protection of Big Computing and Storage 252

10.8.2 Big Data Analytics Protection 252

10.8.3 Big Data CPS Applications 256

10.9 Smart Grids 256

10.10 Military Applications 258

10.11 City Management 259

10.12 Clinical Applications 261

10.13 Calamity Events 262

10.14 Data Streams Clustering by Sensors 263

10.15 The Flocking Model 263

10.16 Calculation Depiction 264

10.17 Initialization 265

10.18 Representative Maintenance and Clustering 266

10.19 Results 267

10.20 Conclusion 268

References 269

11 Developing Decision Making and Risk Mitigation: Using CRISP-Data Mining 281
Vivek Parganiha, Soorya Prakash Shukla and Lokesh Kumar Sharma

11.1 Introduction 282

11.2 Background 283

11.3 Methodology of CRISP-DM 284

11.4 Stage One - Determine Business Objectives 286

11.4.1 What Are the Ideal Yields of the Venture? 287

11.4.2 Evaluate the Current Circumstance 288

11.4.3 Realizes Data Mining Goals 289

11.5 Stage Two - Data Sympathetic 290

11.5.1 Portray Data 291

11.5.2 Investigate Facts 291

11.5.3 Confirm Data Quality 292

11.5.4 Data Excellence Description 292

11.6 Stage Three - Data Preparation 292

11.6.1 Select Your Data 294

11.6.2 The Data Is Processed 294

11.6.3 Data Needed to Build 294

11.6.4 Combine Information 295

11.7 Stage Four - Modeling 295

11.7.1 Select Displaying Strategy 296

11.7.2 Produce an Investigation Plan 297

11.7.3 Fabricate Ideal 297

11.7.4 Evaluation Model 297

11.8 Stage Five - Evaluation 298

11.8.1 Assess Your Outcomes 299

11.8.2 Survey Measure 299

11.8.3 Decide on the Subsequent Stages 300

11.9 Stage Six - Deployment 300

11.9.1 Plan Arrangement 301

11.9.2 Plan Observing and Support 301

11.9.3 Produce the Last Report 302

11.9.4 Audit Venture 302

11.10 Data on ERP Systems 302

11.11 Usage of CRISP-DM Methodology 304

11.12 Modeling 306

11.12.1 Association Rule Mining (ARM) or Association Analysis 307

11.12.2 Classification Algorithms 307

11.12.3 Regression Algorithms 308

11.12.4 Clustering Algorithms 308

11.13 Assessment 310

11.14 Distribution 310

11.15 Results and Discussion 310

11.16 Conclusion 311

References 314

12 Human-Machine Interaction and Visual Data Mining 317
Upasana Sinha, Akanksha Gupta, Samera Khan, Shilpa Rani and Swati Jain

12.1 Introduction 318

12.2 Related Researches 320

12.2.1 Data Mining 323

12.2.2 Data Visualization 323

12.2.3 Visual Learning 324

12.3 Visual Genes 325

12.4 Visual Hypotheses 326

12.5 Visual Strength and Conditioning 326

12.6 Visual Optimization 327

12.7 The Vis 09 Model 327

12.8 Graphic Monitoring and Contact With Human-Computer 328

12.9 Mining HCI Information Using Inductive Deduction Viewpoint 332

12.10 Visual Data Mining Methodology 334

12.11 Machine Learning Algorithms for Hand Gesture Recognition 338

12.12 Learning 338

12.13 Detection 339

12.14 Recognition 340

12.15 Proposed Methodology for Hand Gesture Recognition 340

12.16 Result 343

12.17 Conclusion 343

References 344

13 MSDTrA: A Boosting Based-Transfer Learning Approach for Class Imbalanced Skin Lesion Dataset for Melanoma Detection 349
Lokesh Singh, Rekh Ram Janghel and Satya Prakash Sahu

13.1 Introduction 349

13.2 Literature Survey 352

13.3 Methods and Material 353

13.3.1 Proposed Methodology: Multi Source Dynamic TrAdaBoost Algorithm 355

13.4 Experimental Results 357

13.5 Libraries Used 357

13.6 Comparing Algorithms Based on Decision Boundaries 357

13.7 Evaluating Results 358

13.8 Conclusion 361

References 361

14 New Algorithms and Technologies for Data Mining 365
Padma Bonde, Latika Pinjarkar, Korhan Cengiz, Aditi Shukla and Maguluri Sudeep Joel

14.1 Introduction 366

14.2 Machine Learning Algorithms 368

14.3 Supervised Learning 368

14.4 Unsupervised Learning 369

14.5 Semi-Supervised Learning 369

14.6 Regression Algorithms 371

14.7 Case-Based Algorithms 371

14.8 Regularization Algorithms 372

14.9 Decision Tree Algorithms 372

14.10 Bayesian Algorithms 373

14.11 Clustering Algorithms 374

14.12 Association Rule Learning Algorithms 375

14.13 Artificial Neural Network Algorithms 375

14.14 Deep Learning Algorithms 376

14.15 Dimensionality Reduction Algorithms 377

14.16 Ensemble Algorithms 377

14.17 Other Machine Learning Algorithms 378

14.18 Data Mining Assignments 378

14.19 Data Mining Models 381

14.20 Non-Parametric & Parametric Models 381

14.21 Flexible vs. Restrictive Methods 382

14.22 Unsupervised vs. Supervised Learning 382

14.23 Data Mining Methods 384

14.24 Proposed Algorithm 387

14.24.1 Organization Formation Procedure 387

14.25 The Regret of Learning Phase 388

14.26 Conclusion 392

References 392

15 Classification of EEG Signals for Detection of Epileptic Seizure Using Restricted Boltzmann Machine Classifier 397
Sudesh Kumar, Rekh Ram Janghel and Satya Prakash Sahu

15.1 Introduction 398

15.2 Related Work 400

15.3 Material and Methods 401

15.3.1 Dataset Description 401

15.3.2 Proposed Methodology 403

15.3.3 Normalization 404

15.3.4 Preprocessing Using PCA 404

15.3.5 Restricted Boltzmann Machine (RBM) 406

15.3.6 Stochastic Binary Units (Bernoulli Variables) 407

15.3.7 Training 408

15.3.7.1 Gibbs Sampling 409

15.3.7.2 Contrastive Divergence (CD) 409

15.4 Experimental Framework 410

15.5 Experimental Results and Discussion 412

15.5.1 Performance Measurement Criteria 412

15.5.2 Experimental Results 412

15.6 Discussion 414

15.7 Conclusion 418

References 419

16 An Enhanced Security of Women and Children Using Machine Learning and Data Mining Techniques 423
Nanda R. Wagh and Sanjay R. Sutar

16.1 Introduction 424

16.2 Related Work 424

16.2.1 WoSApp 424

16.2.2 Abhaya 425

16.2.3 Women Empowerment 425

16.2.4 Nirbhaya 425

16.2.5 Glympse 426

16.2.6 Fightback 426

16.2.7 Versatile-Based 426

16.2.8 RFID 426

16.2.9 Self-Preservation Framework for Women With Area Following and SMS Alarming Through GSM Network 426

16.2.10 Safe: A Women Security Framework 427

16.2.11 Intelligent Safety System For Women Security 427

16.2.12 A Mobile-Based Women Safety Application 427

16.2.13 Self-Salvation - The Women’s Security Module 427

16.3 Issue and Solution 427

16.3.1 Inspiration 427

16.3.2 Issue Statement and Choice of Solution 428

16.4 Selection of Data 428

16.5 Pre-Preparation Data 430

16.5.1 Simulation 431

16.5.2 Assessment 431

16.5.3 Forecast 434

16.6 Application Development 436

16.6.1 Methodology 436

16.6.2 AI Model 437

16.6.3 Innovations Used The Proposed Application Has Utilized After Technologies 437

16.7 Use Case For The Application 437

16.7.1 Application Icon 437

16.7.2 Enlistment Form 438

16.7.3 Login Form 439

16.7.4 Misconduct Place Detector 439

16.7.5 Help Button 440

16.8 Conclusion 443

References 443

17 Conclusion and Future Direction in Data Mining and Machine Learning 447
Santosh R. Durugkar, Rohit Raja, Kapil Kumar Nagwanshi and Ramakant Chandrakar

17.1 Introduction 448

17.2 Machine Learning 451

17.2.1 Neural Network 452

17.2.2 Deep Learning 452

17.2.3 Three Activities for Object Recognition 453

17.3 Conclusion 457

References 457

Index 461

Authors

Rohit Raja, Kapil Kumar Nagwanshi, Sandeep Kumar, K. Ramya Laxmi