
Structured Parallel Programming

- Patterns for Efficient Computation

Book
  • Format: Book, paperback
  • Language: English

Description

Structured Parallel Programming offers the simplest way for developers to learn patterns for high-performance parallel programming. Written by parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders, this book explains how to design and implement maintainable and efficient parallel algorithms using a composable, structured, scalable, and machine-independent approach to parallel computing. It presents both theory and practice, with detailed, concrete examples written in two popular, cutting-edge programming models for parallel programming: Threading Building Blocks and Cilk Plus. These architecture-independent models enable easy integration into existing applications, preserve investments in existing code, and speed the development of parallel applications. Examples from realistic contexts illustrate patterns and themes in parallel algorithm design that are widely applicable regardless of implementation technology. Software developers, computer programmers, and software architects will find this book extremely helpful.

Details
Size and weight
  • Weight: 770 g
  • Dimensions: 19.1 cm × 23.5 cm
