"Thorough coverage of a difficult topic... excellent explanations of concepts."

*OpenCL in Action* is a thorough, hands-on presentation of OpenCL, with an eye toward showing developers how to build high-performance applications of their own. It begins by presenting the core concepts behind OpenCL, including vector computing, parallel programming, and multi-threaded operations, and then guides you step-by-step from simple data structures to complex functions.

## preface

## acknowledgments

## about this book

# Part 1 Foundations of OpenCL programming

## 1. Introducing OpenCL

### 1.1. The dawn of OpenCL

### 1.2. Why OpenCL?

### 1.3. Analogy: OpenCL processing and a game of cards

### 1.4. A first look at an OpenCL application

### 1.5. The OpenCL standard and extensions

### 1.6. Frameworks and software development kits (SDKs)

### 1.7. Summary

## 2. Host programming: fundamental data structures

### 2.1. Primitive data types

### 2.2. Accessing platforms

### 2.3. Accessing installed devices

### 2.4. Managing devices with contexts

### 2.5. Storing device code in programs

### 2.6. Packaging functions in kernels

### 2.7. Collecting kernels in a command queue

### 2.8. Summary

## 3. Host programming: data transfer and partitioning

### 3.1. Setting kernel arguments

### 3.2. Buffer objects

### 3.3. Image objects

### 3.4. Obtaining information about buffer objects

### 3.5. Memory object transfer commands

### 3.6. Data partitioning

### 3.7. Summary

## 4. Kernel programming: data types and device memory

### 4.1. Introducing kernel coding

### 4.2. Scalar data types

### 4.3. Floating-point computing

### 4.4. Vector data types

### 4.5. The OpenCL device model

### 4.6. Local and private kernel arguments

### 4.7. Summary

## 5. Kernel programming: operators and functions

### 5.1. Operators

### 5.2. Work-item and work-group functions

### 5.3. Data transfer operations

### 5.4. Floating-point functions

### 5.5. Integer functions

### 5.6. Shuffle and select functions

### 5.7. Vector test functions

### 5.8. Geometric functions

### 5.9. Summary

## 6. Image processing

### 6.1. Image objects and samplers

### 6.2. Image processing functions

### 6.3. Image scaling and interpolation

### 6.4. Summary

## 7. Events, profiling, and synchronization

### 7.1. Host notification events

### 7.2. Command synchronization events

### 7.3. Profiling events

### 7.4. Work-item synchronization

### 7.5. Summary

## 8. Development with C++

### 8.1. Preliminary concerns

### 8.2. Creating kernels

### 8.3. Kernel arguments and memory objects

### 8.4. Command queues

### 8.5. Event processing

### 8.6. Summary

## 9. Development with Java and Python

### 9.1. Aparapi

### 9.2. JavaCL

### 9.3. PyOpenCL

### 9.4. Summary

## 10. General coding principles

### 10.1. Global size and local size

### 10.2. Numerical reduction

### 10.3. Synchronizing work-groups

### 10.4. Ten tips for high-performance kernels

### 10.5. Summary

# Part 2 Coding practical algorithms in OpenCL

## 11. Reduction and sorting

### 11.1. MapReduce

### 11.2. The bitonic sort

### 11.3. The radix sort

### 11.4. Summary

## 12. Matrices and QR decomposition

### 12.1. Matrix transposition

### 12.2. Matrix multiplication

### 12.3. The Householder transformation

### 12.4. The QR decomposition

### 12.5. Summary

## 13. Sparse matrices

### 13.1. Differential equations and sparse matrices

### 13.2. Sparse matrix storage and the Harwell-Boeing collection

### 13.3. The method of steepest descent

### 13.4. The conjugate gradient method

### 13.5. Summary

## 14. Signal processing and the fast Fourier transform

### 14.1. Introducing frequency analysis

### 14.2. The discrete Fourier transform

### 14.3. The fast Fourier transform

### 14.4. Summary

# Part 3 Accelerating OpenGL with OpenCL

## 15. Combining OpenCL and OpenGL

### 15.1. Sharing data between OpenGL and OpenCL

### 15.2. Obtaining information

### 15.3. Basic interoperability example

### 15.4. Interoperability and animation

### 15.5. Summary

## 16. Textures and renderbuffers

### 16.1. Image filtering

### 16.2. Filtering textures with OpenCL

### 16.3. Summary

## Appendix A: Installing and using a software development kit

## Appendix B: Real-time rendering with OpenGL

## Appendix C: The minimalist GNU for Windows and OpenCL

## Appendix D: OpenCL on mobile devices

## index

© 2014 Manning Publications Co.

## About the Technology

Whatever system you have, it probably has more raw processing power than you're using. OpenCL is a high-performance programming framework that maximizes computational power by executing on CPUs, graphics processors, and other number-crunching devices. It's perfect for speed-sensitive tasks like vector computing, matrix operations, and graphics acceleration.

## About the book

*OpenCL in Action* blends the theory of parallel computing with the practical reality of building high-performance applications using OpenCL. It first guides you through the fundamental data structures in an intuitive manner. Then it explains techniques for high-speed sorting, image processing, matrix operations, and the fast Fourier transform. The book concludes with a deep look at the all-important subject of graphics acceleration. Numerous challenging examples give you different ways to experiment with working code.

A background in C or C++ is helpful, but no prior exposure to OpenCL is needed.

"Well-researched and a good read. It's difficult to find this information elsewhere."

"Lucid explanation of OpenCL with many well-chosen applications."

"Clearly the best OpenCL reference and hands-on guide out there, packed with amazing real-world examples."