Multicore and GPU Programming: An Integrated Approach, 2nd Edition, by Gerassimos Barlas
Product details:
ISBN 10: 0128141204
ISBN 13: 9780128141205
Author: Gerassimos Barlas
Multicore and GPU Programming: An Integrated Approach, Second Edition offers broad coverage of key parallel computing tools, essential for multi-core CPU programming and many-core “massively parallel” computing. Using threads, OpenMP, MPI, CUDA and other state-of-the-art tools, the book teaches the design and development of software capable of taking advantage of modern computing platforms that incorporate CPUs, GPUs and other accelerators.
Presenting material refined over more than two decades of teaching parallel computing, author Gerassimos Barlas minimizes the challenge of transitioning from sequential programming to mastering parallel platforms with multiple examples, extensive case studies, and full source code. By using this book, readers will better understand how to develop programs that run over distributed memory machines using MPI, create multi-threaded applications with either libraries or directives, write optimized applications that balance the workload between available computing resources, and profile and debug programs targeting parallel machines.
- Includes comprehensive coverage of all major multi-core and many-core programming tools and platforms, including threads, OpenMP, MPI, CUDA, OpenCL and Thrust
- Covers the most recent versions of the above at the time of publication
- Demonstrates parallel programming design patterns and examples of how different tools and paradigms can be integrated for superior performance
- Updates in the second edition include the use of the C++17 standard for all sample code, a new chapter on concurrent data structures, a new chapter on OpenCL, and the latest research on load balancing
- Includes downloadable source code, examples and instructor support materials on the book’s companion website
Table of contents:
Part 1: Introduction
Chapter 1. Introduction
Abstract
1.1 The era of multicore machines
1.2 A taxonomy of parallel machines
1.3 A glimpse of influential computing machines
1.4 Performance metrics
1.5 Predicting and measuring parallel program performance
Exercises
Bibliography
Chapter 2. Multicore and parallel program design
Abstract
2.1 Introduction
2.2 The PCAM methodology
2.3 Decomposition patterns
2.4 Program structure patterns
2.5 Matching decomposition patterns with program structure patterns
Exercises
Bibliography
Part 2: Programming with threads and processes
Chapter 3. Threads and concurrency in standard C++
Abstract
3.1 Introduction
3.2 Threads
3.3 Thread creation and initialization
3.4 Sharing data between threads
3.5 Design concerns
3.6 Semaphores
3.7 Applying semaphores in classical problems
3.8 Atomic data types
3.9 Monitors
3.10 Applying monitors in classical problems
3.11 Asynchronous threads
3.12 Dynamic vs. static thread management
3.13 Threads and fibers
3.14 Debugging multi-threaded applications
Exercises
Bibliography
Chapter 4. Parallel data structures
Abstract
4.1 Introduction
4.2 Lock-based structures
4.3 Lock-free structures
4.4 Closing remarks
Exercises
Bibliography
Chapter 5. Distributed memory programming
Abstract
5.1 Introduction
5.2 MPI
5.3 Core concepts
5.4 Your first MPI program
5.5 Program architecture
5.6 Point-to-point communication
5.7 Alternative point-to-point communication modes
5.8 Non-blocking communications
5.9 Point-to-point communications: summary
5.10 Error reporting & handling
5.11 Collective communications
5.12 Persistent communications
5.13 Big-count communications in MPI 4.0
5.14 Partitioned communications
5.15 Communicating objects
5.16 Node management: communicators and groups
5.17 One-sided communication
5.18 I/O considerations
5.19 Combining MPI processes with threads
5.20 Timing and performance measurements
5.21 Debugging, profiling, and tracing MPI programs
5.22 The Boost.MPI library
5.23 A case study: diffusion-limited aggregation
5.24 A case study: brute-force encryption cracking
5.25 A case study: MPI implementation of the master–worker pattern
Exercises
Bibliography
Chapter 6. GPU programming: CUDA
Abstract
6.1 Introduction
6.2 CUDA’s programming model: threads, blocks, and grids
6.3 CUDA’s execution model: streaming multiprocessors and warps
6.4 CUDA compilation process
6.5 Putting together a CUDA project
6.6 Memory hierarchy
6.7 Optimization techniques
6.8 Graphs
6.9 Warp functions
6.10 Cooperative groups
6.11 Dynamic parallelism
6.12 Debugging CUDA programs
6.13 Profiling CUDA programs
6.14 CUDA and MPI
6.15 Case studies
Exercises
Bibliography
Chapter 7. GPU and accelerator programming: OpenCL
Abstract
7.1 The OpenCL architecture
7.2 The platform model
7.3 The execution model
7.4 The programming model
7.5 The memory model
7.6 Shared virtual memory
7.7 Atomics and synchronization
7.8 Work group functions
7.9 Events and profiling OpenCL programs
7.10 OpenCL and other parallel software platforms
7.11 Case study: Mandelbrot set
Exercises
Part 3: Higher-level parallel programming
Chapter 8. Shared-memory programming: OpenMP
Abstract
8.1 Introduction
8.2 Your first OpenMP program
8.3 Variable scope
8.4 Loop-level parallelism
8.5 Task parallelism
8.6 Synchronization constructs
8.7 Cancellation constructs
8.8 SIMD extensions
8.9 Offloading to devices
8.10 The loop construct
8.11 Thread affinity
8.12 Correctness and optimization issues
8.13 A case study: sorting in OpenMP
8.14 A case study: brute-force encryption cracking, combining MPI and OpenMP
Exercises
Bibliography
Chapter 9. High-level multi-threaded programming with the Qt library
Abstract
9.1 Introduction
9.2 Implicit thread creation
9.3 Qt’s pool of threads
9.4 Higher-level constructs – multi-threaded programming without threads!
Exercises
Bibliography
Chapter 10. The Thrust template library
Abstract
10.1 Introduction
10.2 First steps in Thrust
10.3 Working with Thrust datatypes
10.4 Thrust algorithms
10.5 Fancy iterators
10.6 Switching device back-ends
10.7 Thrust execution policies and asynchronous execution
10.8 Case studies
Exercises
Bibliography
Part 4: Advanced topics
Chapter 11. Load balancing
Abstract
11.1 Introduction
11.2 Dynamic load balancing: the Linda legacy
11.3 Static load balancing: the divisible load theory approach
11.4 DLTLib: a library for partitioning workloads
11.5 Case studies