Multicore and GPU Programming: An Integrated Approach, 2nd Edition by Gerassimos Barlas
Product details:
ISBN 10: 0128141212
ISBN 13: 9780128141212
Author: Gerassimos Barlas
Multicore and GPU Programming: An Integrated Approach, Second Edition offers broad coverage of key parallel computing tools, essential for multi-core CPU programming and many-core “massively parallel” computing. Using threads, OpenMP, MPI, CUDA and other state-of-the-art tools, the book teaches the design and development of software capable of taking advantage of modern computing platforms that incorporate CPUs, GPUs and other accelerators.
Presenting material refined over more than two decades of teaching parallel computing, author Gerassimos Barlas eases the transition from sequential programming to parallel platforms through numerous examples, extensive case studies, and full source code. Readers will learn how to develop programs that run on distributed-memory machines using MPI, create multi-threaded applications with either libraries or directives, write optimized applications that balance the workload across the available computing resources, and profile and debug programs targeting parallel machines.
- Includes comprehensive coverage of all major multi-core and many-core programming tools and platforms, including threads, OpenMP, MPI, CUDA, OpenCL and Thrust
- Covers the most recent versions of these tools and standards available at the time of publication
- Demonstrates parallel programming design patterns and examples of how different tools and paradigms can be integrated for superior performance
- Updates in the second edition include the use of the C++17 standard for all sample code, a new chapter on concurrent data structures, a new chapter on OpenCL, and the latest research on load balancing
- Includes downloadable source code, examples and instructor support materials on the book’s companion website
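To give a concrete sense of the C++17 style the book's sample code follows, the short sketch below splits a summation across hardware threads and merges the partial results through an atomic accumulator. It is a generic illustration of the threading facilities covered in Chapter 3, not an excerpt from the book or its companion code.

    // Minimal C++17 sketch (illustrative only, not from the book's sources):
    // partition a summation across hardware threads and combine the partial
    // sums through an atomic accumulator.
    #include <algorithm>
    #include <atomic>
    #include <iostream>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        const std::size_t N = 1'000'000;
        std::vector<int> data(N, 1);            // toy input: one million ones
        const unsigned nThreads =
            std::max(1u, std::thread::hardware_concurrency());
        std::atomic<long long> total{0};        // shared, race-free accumulator

        std::vector<std::thread> workers;
        for (unsigned t = 0; t < nThreads; ++t)
            workers.emplace_back([&, t] {
                // Each worker sums a contiguous slice of the input...
                const std::size_t lo = t * N / nThreads;
                const std::size_t hi = (t + 1) * N / nThreads;
                const long long local =
                    std::accumulate(data.begin() + lo, data.begin() + hi, 0LL);
                total += local;                 // ...and publishes one atomic add
            });
        for (auto& w : workers)
            w.join();                           // wait for every worker to finish

        std::cout << "sum = " << total << '\n'; // prints: sum = 1000000
        return 0;
    }

Compile with a C++17-capable compiler and the threading flag, e.g. g++ -std=c++17 -pthread sum.cpp.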
Multicore and GPU Programming: An Integrated Approach, 2nd Edition – Table of Contents:
Part 1: Introduction
Chapter 1: Introduction
1.1. The era of multicore machines
1.2. A taxonomy of parallel machines
1.3. A glimpse of influential computing machines
1.4. Performance metrics
1.5. Predicting and measuring parallel program performance
Exercises
Bibliography
Chapter 2: Multicore and parallel program design
2.1. Introduction
2.2. The PCAM methodology
2.3. Decomposition patterns
2.4. Program structure patterns
2.5. Matching decomposition patterns with program structure patterns
Exercises
Bibliography
Part 2: Programming with threads and processes
Chapter 3: Threads and concurrency in standard C++
3.1. Introduction
3.2. Threads
3.3. Thread creation and initialization
3.4. Sharing data between threads
3.5. Design concerns
3.6. Semaphores
3.7. Applying semaphores in classical problems
3.8. Atomic data types
3.9. Monitors
3.10. Applying monitors in classical problems
3.11. Asynchronous threads
3.12. Dynamic vs. static thread management
3.13. Threads and fibers
3.14. Debugging multi-threaded applications
Exercises
Bibliography
Chapter 4: Parallel data structures
4.1. Introduction
4.2. Lock-based structures
4.3. Lock-free structures
4.4. Closing remarks
Exercises
Bibliography
Chapter 5: Distributed memory programming
5.1. Introduction
5.2. MPI
5.3. Core concepts
5.4. Your first MPI program
5.5. Program architecture
5.6. Point-to-point communication
5.7. Alternative point-to-point communication modes
5.8. Non-blocking communications
5.9. Point-to-point communications: summary
5.10. Error reporting & handling
5.11. Collective communications
5.12. Persistent communications
5.13. Big-count communications in MPI 4.0
5.14. Partitioned communications
5.15. Communicating objects
5.16. Node management: communicators and groups
5.17. One-sided communication
5.18. I/O considerations
5.19. Combining MPI processes with threads
5.20. Timing and performance measurements
5.21. Debugging, profiling, and tracing MPI programs
5.22. The Boost.MPI library
5.23. A case study: diffusion-limited aggregation
5.24. A case study: brute-force encryption cracking
5.25. A case study: MPI implementation of the master–worker pattern
Exercises
Bibliography
Chapter 6: GPU programming: CUDA
6.1. Introduction
6.2. CUDA’s programming model: threads, blocks, and grids
6.3. CUDA’s execution model: streaming multiprocessors and warps
6.4. CUDA compilation process
6.5. Putting together a CUDA project
6.6. Memory hierarchy
6.7. Optimization techniques
6.8. Graphs
6.9. Warp functions
6.10. Cooperative groups
6.11. Dynamic parallelism
6.12. Debugging CUDA programs
6.13. Profiling CUDA programs
6.14. CUDA and MPI
6.15. Case studies
Exercises
Bibliography
Chapter 7: GPU and accelerator programming: OpenCL
7.1. The OpenCL architecture
7.2. The platform model
7.3. The execution model
7.4. The programming model
7.5. The memory model
7.6. Shared virtual memory
7.7. Atomics and synchronization
7.8. Work group functions
7.9. Events and profiling OpenCL programs
7.10. OpenCL and other parallel software platforms
7.11. Case study: Mandelbrot set
Exercises
Part 3: Higher-level parallel programming
Chapter 8: Shared-memory programming: OpenMP
8.1. Introduction
8.2. Your first OpenMP program
8.3. Variable scope
8.4. Loop-level parallelism
8.5. Task parallelism
8.6. Synchronization constructs
8.7. Cancellation constructs
8.8. SIMD extensions
8.9. Offloading to devices
8.10. The loop construct
8.11. Thread affinity
8.12. Correctness and optimization issues
8.13. A case study: sorting in OpenMP
8.14. A case study: brute-force encryption cracking, combining MPI and OpenMP
Exercises
Bibliography
Chapter 9: High-level multi-threaded programming with the Qt library
9.1. Introduction
9.2. Implicit thread creation
9.3. Qt’s pool of threads
9.4. Higher-level constructs – multi-threaded programming without threads!
Exercises
Bibliography
Chapter 10: The Thrust template library
10.1. Introduction
10.2. First steps in Thrust
10.3. Working with Thrust datatypes
10.4. Thrust algorithms
10.5. Fancy iterators
10.6. Switching device back-ends
10.7. Thrust execution policies and asynchronous execution
10.8. Case studies
Exercises
Bibliography
Part 4: Advanced topics
Chapter 11: Load balancing
11.1. Introduction
11.2. Dynamic load balancing: the Linda legacy
11.3. Static load balancing: the divisible load theory approach
11.4. DLTlib: a library for partitioning workloads
11.5. Case studies
Exercises
Bibliography
Appendix A: Creating Qt programs
A.1. Using an IDE
A.2. The qmake utility
Appendix B: Running MPI programs: preparatory and configuration steps
B.1. Preparatory steps
B.2. Computing nodes discovery for MPI program deployment
Appendix C: Time measurement
C.1. Introduction
C.2. POSIX high-resolution timing
C.3. Timing in C++11
C.4. Timing in Qt
C.5. Timing in OpenMP
C.6. Timing in MPI
C.7. Timing in CUDA
Appendix D: Boost.MPI
D.1. Mapping from MPI C to Boost.MPI
Appendix E: Setting up CUDA
E.1. Installation
E.2. Issues with GCC
E.3. Combining CUDA with third-party libraries
Appendix F: OpenCL helper functions
F.1. Function readCLFromFile
F.2. Function isError
F.3. Function getCompilationError
F.4. Function handleError
F.5. Function setupDevice
F.6. Function setupProgramAndKernel
Appendix G: DLTlib
G.1. DLTlib functions
G.2. DLTlib files
Bibliography
Glossary
Index