
Improving Product Reliability and Software Quality. Strategies, Tools, Process and Implementation. Edition No. 2. Quality and Reliability Engineering Series

  • Book

  • 456 Pages
  • April 2019
  • John Wiley and Sons Ltd
  • ID: 5227957

The authoritative guide to the effective design and production of reliable technology products, revised and updated

While most manufacturers have mastered the process of producing quality products, product reliability, software quality and software security have lagged behind. The revised second edition of Improving Product Reliability and Software Quality offers a comprehensive and detailed guide to implementing a hardware reliability and software quality process for technology products. The authors, noted experts in the field, provide useful tools, forms and spreadsheets for executing an effective product reliability and software quality development process, and explore proven software quality and product reliability concepts.

The authors discuss why so many companies fail when they attempt to implement or improve their product reliability and software quality program, and they outline the critical steps for implementing a successful one. Success hinges on establishing a reliability lab, hiring the right people and implementing reliability and software quality processes that do the right things well and work well together. Designed to be accessible, the book contains a decision matrix for small, medium and large companies. Throughout the book, the authors describe the hardware reliability and software quality process as well as the tools and techniques needed for putting it in place. The concepts, ideas and material presented are appropriate for any organization. This updated second edition:

  • Contains new chapters on software tools, software quality process and software security.
  • Expands the FMEA section to include software fault trees and software FMEAs.
  • Includes two new reliability tools to accelerate design maturity and reduce the risk of premature wearout.
  • Contains new material on preventative maintenance, predictive maintenance and Prognostics and Health Management (PHM) to better manage repair cost and unscheduled downtime.
  • Presents updated information on reliability modeling and hiring reliability and software engineers.
  • Includes a comprehensive review of the reliability process from a multi-disciplinary viewpoint, including new material on uprating and counterfeit components.
  • Discusses aspects of competition and key quality and reliability concepts, and presents the tools for implementation.

Written for engineers, managers and consultants lacking a background in product reliability and software quality theory and statistics, the updated second edition of Improving Product Reliability and Software Quality explores all phases of the product life cycle. 

Table of Contents

About the Authors

List of Figures

List of Tables

Series Editor's Foreword

Series Foreword Second Edition

Series Foreword First Edition

Foreword First Edition

Preface Second Edition

Preface First Edition

Acknowledgments

Glossary

Part I Reliability and Software Quality - It’s a Matter of Survival

1 The Need for a New Paradigm for Hardware Reliability and Software Quality

1.1 Rapidly Shifting Challenges for Hardware Reliability and Software Quality

1.2 Gaining Competitive Advantage

1.3 Competing in the Next Decade - Winners Will Compete on Reliability

1.4 Concurrent Engineering

1.5 Reducing the Number of Engineering Change Orders at Product Release

1.6 Time-to-Market Advantage

1.7 Accelerating Product Development

1.8 Identifying and Managing Risks

1.9 ICM, a Process to Mitigate Risk

1.10 Software Quality Overview

References

Further Reading

2 Barriers to Implementing Hardware Reliability and Software Quality

2.1 Lack of Understanding

2.2 Internal Barriers

2.3 Implementing Change and Change Agents

2.4 Building Credibility

2.5 Perceived External Barriers

2.6 Time to Gain Acceptance

2.7 External Barrier

2.8 Barriers to Software Process Improvement

3 Understanding Why Products Fail

3.1 Why Things Fail

3.2 Parts Have Improved, Everyone Can Build Quality Products

3.3 Hardware Reliability and Software Quality - The New Paradigm

3.4 Reliability vs. Quality Escapes

3.5 Why Software Quality Improvement Programs Are Unsuccessful

Further Reading

4 Alternative Approaches to Implementing Reliability

4.1 Hiring Consultants for HALT Testing

4.2 Outsourcing Reliability Testing

4.3 Using Consultants to Develop and Implement a Reliability Program

4.4 Hiring Reliability Engineers

Part II Unraveling the Mystery

5 The Product Life Cycle

5.1 Six Phases of the Product Life Cycle

5.2 Risk Mitigation

5.3 The ICM Process for a Small Company

5.4 Design Guidelines

5.5 Warranty

Further Reading

Reliability Process

DFM

6 Reliability Concepts

6.1 The Bathtub Curve

6.2 Mean Time between Failure

6.3 Warranty Costs

6.4 Availability

6.5 Reliability Growth

6.6 Reliability Demonstration Testing

6.7 Maintenance and Availability

6.8 Component Derating

6.9 Component Uprating

Reference

Further Reading

Reliability Growth

Reliability Demonstration

Prognostics and Health Management

7 FMEA

7.1 Benefits of FMEA

7.2 Components of FMEA

7.3 Preparing for the FMEA

7.4 Barriers to the FMEA Process

7.5 FMEA Ground Rules

7.6 Using Macros to Improve FMEA Efficiency and Effectiveness

7.7 Software FMEA

7.8 Software Fault Tree Analysis (SFTA)

7.9 Process FMEAs

7.10 FMMEA

8 The Reliability Toolbox

8.1 The HALT Process

8.2 Highly Accelerated Stress Screening (HASS)

8.3 HALT and HASS Test Chambers

8.4 Accelerated Reliability Growth (ARG)

8.5 Accelerated Early Life Test (ELT)

8.6 SPC Tool

8.7 FIFO Tool

References

Further Reading

FMEA

HALT

HASS

Quality

Burn-in

ESS

Up Rating

9 Software Quality Goals and Metrics

9.1 Setting Software Quality Goals

9.2 Software Metrics

9.3 Lines of Code (LOC)

9.4 Defect Density

9.5 Defect Models

9.6 Defect Run Chart

9.7 Escaped Defect Rate

9.8 Code Coverage

References

Further Reading

10 Software Quality Analysis Techniques

10.1 Root Cause Analysis

10.2 The 5 Whys

10.3 Cause and Effect Diagrams

10.4 Pareto Charts

10.5 Defect Prevention, Defect Detection, and Defensive Programming

10.6 Effort Estimation

Reference

Further Reading

11 Software Life Cycles

11.1 Waterfall

11.2 Agile

11.3 CMMI

11.4 How to Choose a Software Life Cycle

Reference

Further Reading

12 Software Procedures and Techniques

12.1 Gathering Requirements

12.2 Documenting Requirements

12.3 Documentation

12.4 Code Comments

12.5 Reviews and Inspections

12.6 Traceability

12.7 Defect Tracking

12.8 Software and Hardware Integration

References

Further Reading

13 Why Hardware Reliability and Software Quality Improvement Efforts Fail

13.1 Lack of Commitment to the Reliability Process

13.2 Inability to Embrace and Mitigate Technologies Risk Issues

13.3 Choosing the Wrong People for the Job

13.4 Inadequate Funding

13.5 Inadequate Resources

13.6 MIL-HDBK 217 - Why It Is Obsolete

13.7 Finding But Not Fixing Problems

13.8 Nondynamic Testing

13.9 Vibration Testing Too Difficult to Implement

13.10 The Impact of Late Hardware or Late Software Delivery

13.11 Supplier Reliability

Reference

Further Reading

14 Supplier Management

14.1 Purchasing Interface

14.2 Identifying Your Critical Suppliers

14.3 Develop a Thorough Supplier Audit Process

14.4 Develop Rapid Nonconformance Feedback

14.5 Develop a Materials Review Board (MRB)

14.6 Counterfeit Parts and Materials

Part III Steps to Successful Implementation

15 Establishing a Reliability Lab

15.1 Staffing for Reliability

15.2 The Reliability Lab

15.3 Facility Requirements

15.4 Liquid Nitrogen Requirements

15.5 Air Compressor Requirements

15.6 Selecting a Reliability Lab Location

15.7 Selecting a HALT Test Chamber

Reference

16 Hiring and Staffing the Right People

16.1 Staffing for Reliability

16.2 Staffing for Software Engineers

16.3 Choosing the Wrong People for the Job

17 Implementing the Reliability Process

17.1 Reliability Is Everyone’s Job

17.2 Formalizing the Reliability Process

17.3 Implementing the Reliability Process

17.4 Rolling Out the Reliability Process

17.5 Developing a Reliability Culture

17.6 Setting Reliability Goals

17.7 Training

17.8 Product Life Cycle Defined

17.9 Proactive and Reactive Reliability Activities

Further Reading

Reliability Process

Part IV Reliability and Quality Process for Product Development

18 Product Concept Phase

18.1 Reliability Activities in the Product Concept Phase

18.2 Establish the Reliability Organization

18.3 Define the Reliability Process

18.4 Define the Product Reliability Requirements

18.5 Capture and Apply Lessons Learned

18.6 Mitigate Risk

19 Design Concept Phase

19.1 Reliability Activities in the Design Concept Phase

19.2 Set Reliability Requirements and Budgets

19.3 Define Reliability Design Guidelines

19.4 Revise Risk Mitigation

19.5 Schedule Reliability Activities and Capital Budgets

19.6 Decide Risk Mitigation Sign-off Day

19.7 Reflect on What Worked Well

20 Product Design Phase

20.1 Product Design Phase

20.2 Reliability Estimates

20.3 Implementing Risk Mitigation Plans

20.4 Design for Reliability Guidelines (DFR)

20.5 Design FMEA

20.6 Installing a Failure Reporting Analysis and Corrective Action System

20.7 HALT Planning

20.8 HALT Test Development

20.9 Risk Mitigation Meeting

Further Reading

FMEA

HALT

21 Design Validation Phase

21.1 Design Validation

21.2 Using HALT to Precipitate Failures

21.3 Proof of Screen (POS)

21.4 Highly Accelerated Stress Screen (HASS)

21.5 Operate FRACAS

21.6 Design FMEA

21.7 Closure of Risk Issues

Further Reading

FMEA

Acceleration Methods

ESS

HALT

22 Software Testing and Debugging

22.1 Unit Tests

22.2 Integration Tests

22.3 System Tests

22.4 Regression Tests

22.5 Security Tests

22.6 Guidelines for Creating Test Cases

22.7 Test Plans

22.8 Defect Isolation Techniques

22.9 Instrumentation and Logging

Further Reading

23 Applying Software Quality Procedures

23.1 Using Defect Model to Create Defect Run Chart

23.2 Using Defect Run Chart to Know When You Have Achieved the Quality Target

23.3 Using Root Cause Analysis on Defects to Improve Organizational Quality Delivery

23.4 Continuous Integration and Test

Further Reading

24 Production Phase

24.1 Accelerating Design Maturity

24.2 Reliability Growth

24.3 Design and Process FMEA

Further Reading

FMEA

Quality

Reliability Growth

Burn-In

HASS

25 End-of-Life Phase

25.1 Managing Obsolescence

25.2 Product Termination

25.3 Project Assessment

Further Reading

26 Field Service

26.1 Design for Ease of Access

26.2 Identify High Replacement Assemblies (FRUs)

26.3 Wearout Replacement

26.4 Preemptive Servicing

26.5 Servicing Tools

26.6 Service Loops

26.7 Availability or Repair Time Turnaround

26.8 Avoid System Failure Through Redundancy

26.9 Random versus Wearout Failures

Further Reading

Appendix A

A.1 Reliability Consultants

A.2 Graduate Reliability Engineering Programs and Reliability Certification Programs

A.3 Reliability Professional Organizations and Societies

A.4 Reliability Training Classes

A.5 Environmental Testing Services

A.6 HALT Test Chambers

A.7 Reliability Websites

A.8 Reliability Software

A.9 Reliability Seminars and Conferences

A.10 Reliability Journals

Appendix B

B.1 MTBF, FIT, and PPM Conversions

B.2 Mean Time Between Failure (MTBF)

B.3 Estimating Field Failures

B.3.1 Comparing Repairable to Nonrepairable Systems

Index

Authors

  • Mark A. Levin, Teradyne, Inc., California, USA
  • Ted T. Kalal, Teradyne, Inc., California, USA
  • Jonathan Rodin