The CUDA Handbook: A Comprehensive Guide to GPU Programming

Description

The CUDA Handbook begins where CUDA by Example (Addison-Wesley, 2011) leaves off, discussing CUDA hardware and software in greater detail and covering both CUDA 5.0 and Kepler. Every CUDA developer, from the casual to the most sophisticated, will find something here of interest and immediate usefulness. Newer CUDA developers will see how the hardware processes commands and how the driver checks progress; more experienced CUDA developers will appreciate the expert coverage of topics such as the driver API and context migration, as well as the guidance on how best to structure CPU/GPU data interchange and synchronization.

The accompanying open source code (more than 25,000 lines of it, freely available at www.cudahandbook.com) is specifically intended to be reused and repurposed by developers.

Designed to be both a comprehensive reference and a practical cookbook, the text is divided into the following three parts:

Part I, Overview, gives high-level descriptions of the hardware and software that make CUDA possible.

Part II, Details, provides thorough descriptions of every aspect of CUDA, including

  • Memory
  • Streams and events
  • Models of execution, including the dynamic parallelism feature, new with CUDA 5.0 and SM 3.5
  • The streaming multiprocessors, including descriptions of all features through SM 3.5
  • Programming multiple GPUs
  • Texturing

The source code accompanying Part II is presented as reusable microbenchmarks and microdemos, designed to expose specific hardware characteristics or highlight specific use cases.

Part III, Select Applications, details specific families of CUDA applications and key parallel algorithms, including

  • Streaming workloads
  • Reduction
  • Parallel prefix sum (Scan)
  • N-body
  • Image processing

Together, these algorithms illustrate a broad range of potential CUDA applications.

Table of Contents

Chapter 1: Background
Chapter 2: Hardware Architecture
Chapter 3: Software Architecture
Chapter 4: Software Environment
Chapter 5: Memory
Chapter 6: Streams and Events
Chapter 7: Kernel Execution
Chapter 8: Streaming Multiprocessors
Chapter 9: Multiple GPUs
Chapter 10: Texturing
Chapter 11: Streaming Workloads
Chapter 12: Reduction
Chapter 13: Scan
Chapter 14: N-Body
Chapter 15: Image Processing: Normalized Correlation

Appendix A: The CUDA Handbook Library
