Programming Massively Parallel Processors. A Hands-on Approach. Edition No. 2

Programming Massively Parallel Processors: A Hands-on Approach, Second Edition, teaches students how to program massively parallel processors. It offers a detailed discussion of various techniques for constructing parallel programs. Case studies are used to demonstrate the development process, which begins with computational thinking and ends with effective and efficient parallel programs.

This guide shows students and professionals alike the basic concepts of parallel programming and GPU architecture. Performance, floating-point format, parallel patterns, and dynamic parallelism are covered in depth. This revised edition contains more parallel programming examples, commonly used libraries such as Thrust, and explanations of the latest tools. It also provides new coverage of CUDA 5.0, improved performance, enhanced development tools, increased hardware support, and more. Coverage of the related technology OpenCL is expanded, and there is new material on algorithm patterns, GPU clusters, host programming, and data parallelism. Two new case studies (on MRI reconstruction and molecular visualization) explore the latest applications of CUDA and GPUs for scientific research and high-performance computing.

This book should be a valuable resource for advanced students, software engineers, programmers, and hardware engineers.

1 Introduction
2 History of GPU Computing
3 Introduction to Data Parallelism and CUDA C
4 Data-Parallel Execution Model
5 CUDA Memories
6 Performance Considerations
7 Floating-Point Considerations
8 Parallel Patterns: Convolutions
9 Parallel Patterns: Prefix Sum
10 Parallel Patterns: Sparse Matrix-Vector Multiplication
11 Application Case Study: Advanced MRI Reconstruction
12 Application Case Study: Molecular Visualization and Analysis
13 Parallel Programming and Computational Thinking
14 An Introduction to OpenCL
15 Parallel Programming with OpenACC
16 Thrust: A Productivity-Oriented Library for CUDA
17 CUDA FORTRAN
18 An Introduction to C++ AMP
19 Programming a Heterogeneous Computing Cluster
20 CUDA Dynamic Parallelism
21 Conclusions and Future Outlook

Appendix A: Matrix Multiplication Host-Only Version Source Code
Appendix B: GPU Compute Capabilities

Authors

David B. Kirk NVIDIA Fellow.

David B. Kirk is well recognized for his contributions to graphics hardware and algorithm research. By the time he began his studies at Caltech, he had already earned B.S. and M.S. degrees in mechanical engineering from MIT and worked as an engineer for Raster Technologies and Hewlett-Packard's Apollo Systems Division. After receiving his doctorate, he joined Crystal Dynamics, a video-game manufacturing company, as chief scientist and head of technology. In 1997, he took the position of Chief Scientist at NVIDIA, a leader in visual computing technologies, and he is currently an NVIDIA Fellow.

At NVIDIA, Kirk led graphics-technology development for some of today's most popular consumer-entertainment platforms, playing a key role in providing mass-market graphics capabilities previously available only on workstations costing hundreds of thousands of dollars. For his role in bringing high-performance graphics to personal computers, Kirk received the 2002 Computer Graphics Achievement Award from the Association for Computing Machinery's Special Interest Group on Computer Graphics and Interactive Techniques (ACM SIGGRAPH). In 2006, he was elected to the National Academy of Engineering, one of the highest professional distinctions for engineers.

Kirk holds 50 patents and patent applications relating to graphics design. He has published more than 50 articles on graphics technology, won several best-paper awards, and edited the book Graphics Gems III. A technological "evangelist" who cares deeply about education, he has supported new curriculum initiatives at Caltech and has been a frequent university lecturer and conference keynote speaker worldwide.

Wen-mei W. Hwu CTO, MulticoreWare and professor specializing in compiler design, computer architecture, microarchitecture, and parallel processing, University of Illinois at Urbana-Champaign, USA.

Wen-mei W. Hwu is a Professor and holds the Sanders-AMD Endowed Chair in the Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign. His research interests are in the areas of architecture, implementation, compilation, and algorithms for parallel computing. He is the chief scientist of the Parallel Computing Institute and director of the IMPACT research group (www.impact.crhc.illinois.edu). He is a co-founder and CTO of MulticoreWare. For his contributions to research and teaching, he received the ACM SIGARCH Maurice Wilkes Award, the ACM Grace Murray Hopper Award, the Tau Beta Pi Daniel C. Drucker Eminent Faculty Award, the ISCA Influential Paper Award, the IEEE Computer Society B. R. Rau Award, and the Distinguished Alumni Award in Computer Science of the University of California, Berkeley. He is a fellow of IEEE and ACM. He directs the UIUC CUDA Center of Excellence and serves as one of the principal investigators of the NSF Blue Waters Petascale computer project. Dr. Hwu received his Ph.D. degree in Computer Science from the University of California, Berkeley.
