contents


preface
acknowledgments
about this book

Part 1 Foundations of OpenCL programming

1 Introducing OpenCL
1.1 The dawn of OpenCL
1.2 Why OpenCL?
1.3 Analogy: OpenCL processing and a game of cards
1.4 A first look at an OpenCL application
1.5 The OpenCL standard and extensions
1.6 Frameworks and software development kits (SDKs)
1.7 Summary
2 Host programming: fundamental data structures
2.1 Primitive data types
2.2 Accessing platforms
2.3 Accessing installed devices
2.4 Managing devices with contexts
2.5 Storing device code in programs
2.6 Packaging functions in kernels
2.7 Collecting kernels in a command queue
2.8 Summary
3 Host programming: data transfer and partitioning
3.1 Setting kernel arguments
3.2 Buffer objects
3.3 Image objects
3.4 Obtaining information about buffer objects
3.5 Memory object transfer commands
3.6 Data partitioning
3.7 Summary
4 Kernel programming: data types and device memory
4.1 Introducing kernel coding
4.2 Scalar data types
4.3 Floating-point computing
4.4 Vector data types
4.5 The OpenCL device model
4.6 Local and private kernel arguments
4.7 Summary
5 Kernel programming: operators and functions
5.1 Operators
5.2 Work-item and work-group functions
5.3 Data transfer operations
5.4 Floating-point functions
5.5 Integer functions
5.6 Shuffle and select functions
5.7 Vector test functions
5.8 Geometric functions
5.9 Summary
6 Image processing
6.1 Image objects and samplers
6.2 Image processing functions
6.3 Image scaling and interpolation
6.4 Summary
7 Events, profiling, and synchronization
7.1 Host notification events
7.2 Command synchronization events
7.3 Profiling events
7.4 Work-item synchronization
7.5 Summary
8 Development with C++
8.1 Preliminary concerns
8.2 Creating kernels
8.3 Kernel arguments and memory objects
8.4 Command queues
8.5 Event processing
8.6 Summary
9 Development with Java and Python
9.1 Aparapi
9.2 JavaCL
9.3 PyOpenCL
9.4 Summary
10 General coding principles
10.1 Global size and local size
10.2 Numerical reduction
10.3 Synchronizing work-groups
10.4 Ten tips for high-performance kernels
10.5 Summary

Part 2 Coding practical algorithms in OpenCL

11 Reduction and sorting
11.1 MapReduce
11.2 The bitonic sort
11.3 The radix sort
11.4 Summary
12 Matrices and QR decomposition
12.1 Matrix transposition
12.2 Matrix multiplication
12.3 The Householder transformation
12.4 The QR decomposition
12.5 Summary
13 Sparse matrices
13.1 Differential equations and sparse matrices
13.2 Sparse matrix storage and the Harwell-Boeing collection
13.3 The method of steepest descent
13.4 The conjugate gradient method
13.5 Summary
14 Signal processing and the fast Fourier transform
14.1 Introducing frequency analysis
14.2 The discrete Fourier transform
14.3 The fast Fourier transform
14.4 Summary

Part 3 Accelerating OpenGL with OpenCL

15 Combining OpenCL and OpenGL
15.1 Sharing data between OpenGL and OpenCL
15.2 Obtaining information
15.3 Basic interoperability example
15.4 Interoperability and animation
15.5 Summary
16 Textures and renderbuffers
16.1 Image filtering
16.2 Filtering textures with OpenCL
16.3 Summary

 
appendix A Installing and using a software development kit
appendix B Real-time rendering with OpenGL
appendix C The minimalist GNU for Windows and OpenCL
appendix D OpenCL on mobile devices
index