Tuesday, 8 May 2018

Multi-Core Architectures and Programming - Lecture Notes, Study Materials and Important Questions and Answers

Subject: Multi-Core Architectures and Programming

An Introduction to Parallel Programming by Peter S. Pacheco

Chapter 1 Why Parallel Computing?

  1. Why Parallel Computing? - Answer (click here)
  2. Why We Need Ever-Increasing Performance - Answer (click here)
  3. Why We’re Building Parallel Systems - Answer (click here)
  4. Why we Need to Write Parallel Programs - Answer (click here)
  5. How Do We Write Parallel Programs? - Answer (click here)
  6. Concurrent, Parallel, Distributed - Answer (click here)

Chapter 2 Parallel Hardware and Parallel Software

  1. Parallel Hardware and Parallel Software - Answer (click here)
  2. Some Background: The von Neumann Architecture; Processes, Multitasking, and Threads - Answer (click here)
  3. Modifications to the von Neumann Model - Answer (click here)
  4. Parallel Hardware - Answer (click here)
  5. Parallel Software - Answer (click here)
  6. Input and Output - Answer (click here)
  7. Performance of Parallel Programming - Answer (click here)
  8. Parallel Program Design with example - Answer (click here)
  9. Writing and Running Parallel Programs - Answer (click here)
  10. Assumptions - Parallel Programming - Answer (click here)

Chapter 3 Distributed-Memory Programming with MPI

  1. Distributed-Memory Programming with MPI - Answer (click here)
  2. The Trapezoidal Rule in MPI (see the sketch after this list) - Answer (click here)
  3. Dealing with I/O - Answer (click here)
  4. Collective Communication - Answer (click here)
  5. MPI Derived Datatypes - Answer (click here)
  6. Performance Evaluation of MPI Programs - Answer (click here)
  7. A Parallel Sorting Algorithm - Answer (click here)
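
A minimal sketch of the trapezoidal-rule program this chapter develops, with an assumed integrand f(x) = x*x on [0, 1]: each process integrates its own subinterval and MPI_Reduce sums the partial results on process 0. Compile with mpicc and run with mpiexec.

    /* Trapezoidal rule in MPI - a sketch, assuming f(x) = x*x on [0, 1]
       and that the number of processes evenly divides n. */
    #include <stdio.h>
    #include <mpi.h>

    double f(double x) { return x * x; }        /* assumed integrand */

    double trap(double a, double b, int n, double h) {
        double sum = (f(a) + f(b)) / 2.0;
        for (int i = 1; i < n; i++)
            sum += f(a + i * h);
        return sum * h;
    }

    int main(void) {
        int rank, size, n = 1024;               /* total trapezoids */
        double a = 0.0, b = 1.0, total;
        MPI_Init(NULL, NULL);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double h = (b - a) / n;                 /* trapezoid width */
        int local_n = n / size;                 /* trapezoids per process */
        double local_a = a + rank * local_n * h;
        double local_int = trap(local_a, local_a + local_n * h, local_n, h);

        /* Collective communication: sum the partial integrals on rank 0. */
        MPI_Reduce(&local_int, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("Integral estimate: %.15f\n", total);
        MPI_Finalize();
        return 0;
    }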

Chapter 4 Shared-Memory Programming with Pthreads

  1. Shared-Memory Programming with Pthreads - Answer (click here)
  2. Processes, Threads, and Pthreads - Answer (click here)
  3. Pthreads - Hello, World Program (see the sketch after this list) - Answer (click here)
  4. Matrix-Vector Multiplication - Answer (click here)
  5. Critical Sections - Answer (click here)
  6. Busy-Waiting - Answer (click here)
  7. Mutexes - Answer (click here)
  8. Producer-Consumer Synchronization and Semaphores - Answer (click here)
  9. Barriers and Condition Variables - Answer (click here)
  10. Read-Write Locks - Answer (click here)
  11. Caches, Cache Coherence, and False Sharing - Answer (click here)
  12. Thread-Safety - Answer (click here)
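
Here is a minimal sketch of the "Hello, World" program this chapter builds around; the thread count of 4 is an arbitrary assumption. Compile with gcc and link with -lpthread.

    /* Pthreads "Hello, World" - a sketch; NUM_THREADS is an assumption. */
    #include <stdio.h>
    #include <pthread.h>

    #define NUM_THREADS 4

    void *hello(void *arg) {
        long rank = (long)arg;                  /* thread rank, passed by value */
        printf("Hello from thread %ld\n", rank);
        return NULL;
    }

    int main(void) {
        pthread_t threads[NUM_THREADS];
        for (long t = 0; t < NUM_THREADS; t++)
            pthread_create(&threads[t], NULL, hello, (void *)t);
        for (long t = 0; t < NUM_THREADS; t++)
            pthread_join(threads[t], NULL);     /* wait for every thread */
        return 0;
    }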

Chapter 5 Shared-Memory Programming with OpenMP

  1. Shared-Memory Programming with OpenMP - Answer (click here)
  2. The Trapezoidal Rule - Answer (click here)
  3. Scope of Variables - Answer (click here)
  4. The Reduction Clause (see the sketch after this list) - Answer (click here)
  5. The parallel for Directive - Answer (click here)
  6. More About Loops in OpenMP: Sorting - Answer (click here)
  7. Scheduling Loops - Answer (click here)
  8. Producers and Consumers - Answer (click here)
  9. Caches, Cache Coherence, and False Sharing - Answer (click here)
  10. Thread-Safety - Answer (click here)
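
Item 4 above can be illustrated with a short sketch: a reduction(+:sum) clause gives each thread a private copy of sum and combines the copies when the loop ends. The array size and contents are illustrative assumptions; compile with -fopenmp.

    /* OpenMP reduction clause - a sketch with assumed sample data. */
    #include <stdio.h>

    int main(void) {
        enum { N = 1000 };
        double a[N], sum = 0.0;
        for (int i = 0; i < N; i++)
            a[i] = i * 0.5;                     /* assumed sample data */

        /* Each thread sums into a private copy of sum; OpenMP adds
           the private copies together at the end of the loop. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += a[i];

        printf("sum = %f\n", sum);
        return 0;
    }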

Chapter 6 Parallel Program Development

  1. Parallel Program Development - Answer (click here)
  2. Two n-Body Solvers - Answer (click here)
  3. Parallelizing the basic solver using OpenMP - Answer (click here)
  4. Parallelizing the reduced solver using OpenMP - Answer (click here)
  5. Evaluating the OpenMP codes - Answer (click here)
  6. Parallelizing the solvers using Pthreads - Answer (click here)
  7. Parallelizing the basic solver using MPI - Answer (click here)
  8. Parallelizing the reduced solver using MPI - Answer (click here)
  9. Performance of the MPI solvers - Answer (click here)
  10. Tree Search - Answer (click here)
  11. Recursive depth-first search - Answer (click here)
  12. Nonrecursive depth-first search (see the sketch after this list) - Answer (click here)
  13. Data structures for the serial implementations - Answer (click here)
  14. Performance of the serial implementations - Answer (click here)
  15. Parallelizing tree search - Answer (click here)
  16. A static parallelization of tree search using Pthreads - Answer (click here)
  17. A dynamic parallelization of tree search using Pthreads - Answer (click here)
  18. Evaluating the Pthreads tree-search programs - Answer (click here)
  19. Parallelizing the tree-search programs using OpenMP - Answer (click here)
  20. Performance of the OpenMP implementations - Answer (click here)
  21. Implementation of tree search using MPI and static partitioning - Answer (click here)
  22. Implementation of tree search using MPI and dynamic partitioning - Answer (click here)
  23. Which API? - Answer (click here)
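
For item 12 above, here is a minimal sketch of nonrecursive depth-first search with an explicit stack. The small binary tree is an illustrative assumption; the chapter applies the same idea to the tree of partial tours in its tree-search discussion.

    /* Nonrecursive DFS - a sketch over an assumed small binary tree. */
    #include <stdio.h>

    struct node { int value; struct node *left, *right; };

    void dfs(struct node *root) {
        struct node *stack[64];                 /* assumed maximum depth */
        int top = 0;
        if (root) stack[top++] = root;
        while (top > 0) {
            struct node *cur = stack[--top];    /* pop the deepest node */
            printf("visit %d\n", cur->value);
            /* Push the right child first so the left subtree is
               explored first, matching the recursive order. */
            if (cur->right) stack[top++] = cur->right;
            if (cur->left)  stack[top++] = cur->left;
        }
    }

    int main(void) {
        struct node d = {4, 0, 0}, e = {5, 0, 0}, c = {3, 0, 0};
        struct node b = {2, &d, &e}, a = {1, &b, &c};
        dfs(&a);                                /* visits 1 2 4 5 3 */
        return 0;
    }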

Multicore Application Programming: For Windows, Linux, and Oracle Solaris by Darryl Gove

Chapter 1 Hardware, Processes, and Threads

  1. Hardware, Processes, and Threads - Answer (click here)
  2. Examining the Insides of a Computer - Answer (click here)
  3. The Motivation for Multicore Processors - Answer (click here)
  4. Supporting Multiple Threads on a Single Chip - Answer (click here)
  5. Increasing Instruction Issue Rate with Pipelined Processor Cores - Answer (click here)
  6. Using Caches to Hold Recently Used Data - Answer (click here)
  7. Using Virtual Memory to Store Data - Answer (click here)
  8. Translating from Virtual Addresses to Physical Addresses - Answer (click here)
  9. The Characteristics of Multiprocessor Systems - Answer (click here)
  10. How Latency and Bandwidth Impact Performance - Answer (click here)
  11. The Translation of Source Code to Assembly Language - Answer (click here)
  12. The Performance of 32-Bit versus 64-Bit Code - Answer (click here)
  13. Ensuring the Correct Order of Memory Operations - Answer (click here)
  14. The Differences Between Processes and Threads - Answer (click here)

Chapter 2 Coding for Performance

  1. Coding for Performance - Answer (click here)
  2. Defining Performance - Answer (click here)
  3. Understanding Algorithmic Complexity - Answer (click here)
  4. Why Algorithmic Complexity Is Important - Answer (click here)
  5. Using Algorithmic Complexity with Care - Answer (click here)
  6. How Structure Impacts Performance - Answer (click here)
  7. Performance and Convenience Trade-Offs in Source Code and Build Structures - Answer (click here)
  8. Using Libraries to Structure Applications - Answer (click here)
  9. The Impact of Data Structures on Performance - Answer (click here)
  10. The Role of the Compiler - Answer (click here)
  11. The Two Types of Compiler Optimization - Answer (click here)
  12. Selecting Appropriate Compiler Options - Answer (click here)
  13. How Cross-File Optimization Can Be Used to Improve Performance - Answer (click here)
  14. Using Profile Feedback - Answer (click here)
  15. How Potential Pointer Aliasing Can Inhibit Compiler Optimizations - Answer (click here)
  16. Identifying Where Time Is Spent Using Profiling - Answer (click here)
  17. Commonly Available Profiling Tools - Answer (click here)
  18. How Not to Optimize - Answer (click here)
  19. Performance by Design - Answer (click here)

Chapter 3 Identifying Opportunities for Parallelism

  1. Identifying Opportunities for Parallelism - Answer (click here)
  2. Using Multiple Processes to Improve System Productivity - Answer (click here)
  3. Multiple Users Utilizing a Single System - Answer (click here)
  4. Improving Machine Efficiency Through Consolidation - Answer (click here)
  5. Using Containers to Isolate Applications Sharing a Single System - Answer (click here)
  6. Hosting Multiple Operating Systems Using Hypervisors - Answer (click here)
  7. Using Parallelism to Improve the Performance of a Single Task - Answer (click here)
  8. One Approach to Visualizing Parallel Applications - Answer (click here)
  9. How Parallelism Can Change the Choice of Algorithms - Answer (click here)
  10. Amdahl’s Law (see the worked example after this list) - Answer (click here)
  11. Determining the Maximum Practical Threads - Answer (click here)
  12. How Synchronization Costs Reduce Scaling - Answer (click here)
  13. Parallelization Patterns - Answer (click here)
  14. Data Parallelism Using SIMD Instructions - Answer (click here)
  15. Parallelization Using Processes or Threads - Answer (click here)
  16. Multiple Independent Tasks - Answer (click here)
  17. Multiple Loosely Coupled Tasks - Answer (click here)
  18. Multiple Copies of the Same Task - Answer (click here)
  19. Single Task Split Over Multiple Threads - Answer (click here)
  20. Using a Pipeline of Tasks to Work on a Single Item - Answer (click here)
  21. Division of Work into a Client and a Server - Answer (click here)
  22. Splitting Responsibility into a Producer and a Consumer - Answer (click here)
  23. Combining Parallelization Strategies - Answer (click here)
  24. How Dependencies Influence the Ability to Run Code in Parallel - Answer (click here)
  25. Antidependencies and Output Dependencies - Answer (click here)
  26. Using Speculation to Break Dependencies - Answer (click here)
  27. Critical Paths - Answer (click here)
  28. Identifying Parallelization Opportunities - Answer (click here)
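
Item 10 above is worth a worked example. Amdahl's Law bounds the speedup of a program in which a fraction P of the work can be parallelized over N threads:

    speedup = 1 / ((1 - P) + P / N)

With P = 0.95 and N = 8, the speedup is 1 / (0.05 + 0.95/8) ≈ 5.9, and no number of threads can push it past 1 / (1 - P) = 20.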

Chapter 4 Synchronization and Data Sharing

  1. Synchronization and Data Sharing - Answer (click here)
  2. Data Races - Answer (click here)
  3. Using Tools to Detect Data Races - Answer (click here)
  4. Avoiding Data Races - Answer (click here)
  5. Synchronization Primitives - Answer (click here)
  6. Mutexes and Critical Regions - Answer (click here)
  7. Spin Locks - Answer (click here)
  8. Semaphores - Answer (click here)
  9. Readers-Writer Locks - Answer (click here)
  10. Barriers - Answer (click here)
  11. Atomic Operations and Lock-Free Code - Answer (click here)
  12. Deadlocks and Livelocks - Answer (click here)
  13. Communication Between Threads and Processes - Answer (click here)
  14. Storing Thread-Private Data - Answer (click here)

Chapter 5 Using POSIX Threads

  1. Using POSIX Threads - Answer (click here)
  2. Creating Threads - Answer (click here)
  3. Compiling Multithreaded Code - Answer (click here)
  4. Process Termination - Answer (click here)
  5. Sharing Data Between Threads - Answer (click here)
  6. Variables and Memory - Answer (click here)
  7. Multiprocess Programming - Answer (click here)
  8. Sockets - Answer (click here)
  9. Reentrant Code and Compiler Flags - Answer (click here)

Chapter 6 Windows Threading

  1. Windows Threading - Answer (click here)
  2. Creating Native Windows Threads (see the sketch after this list) - Answer (click here)
  3. Terminating Threads - Answer (click here)
  4. Creating and Resuming Suspended Threads - Answer (click here)
  5. Using Handles to Kernel Resources - Answer (click here)
  6. Methods of Synchronization and Resource Sharing - Answer (click here)
  7. An Example of Requiring Synchronization Between Threads - Answer (click here)
  8. Protecting Access to Code with Critical Sections - Answer (click here)
  9. Protecting Regions of Code with Mutexes - Answer (click here)
  10. Slim Reader/Writer Locks - Answer (click here)
  11. Signaling Event Completion to Other Threads or Processes - Answer (click here)
  12. Wide String Handling in Windows - Answer (click here)
  13. Creating Processes - Answer (click here)
  14. Sharing Memory Between Processes - Answer (click here)
  15. Inheriting Handles in Child Processes - Answer (click here)
  16. Naming Mutexes and Sharing Them Between Processes - Answer (click here)
  17. Communicating with Pipes - Answer (click here)
  18. Communicating Using Sockets - Answer (click here)
  19. Atomic Updates of Variables - Answer (click here)
  20. Allocating Thread-Local Storage - Answer (click here)
  21. Setting Thread Priority - Answer (click here)
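
Item 2 above can be sketched in a few lines: CreateThread starts a native thread, and the returned kernel handle is waited on and then closed. The printed message is an illustrative assumption.

    /* Creating a native Windows thread - a minimal sketch. */
    #include <windows.h>
    #include <stdio.h>

    DWORD WINAPI work(LPVOID param) {
        printf("Hello from thread %lu\n", GetCurrentThreadId());
        return 0;
    }

    int main(void) {
        HANDLE h = CreateThread(NULL,           /* default security */
                                0,              /* default stack size */
                                work,           /* entry point */
                                NULL,           /* no argument */
                                0,              /* run immediately */
                                NULL);          /* thread id not needed */
        if (h == NULL) return 1;
        WaitForSingleObject(h, INFINITE);       /* wait for the thread */
        CloseHandle(h);                         /* release the handle */
        return 0;
    }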

Chapter 7 Using Automatic Parallelization and OpenMP

  1. Using Automatic Parallelization and OpenMP - Answer (click here)
  2. Using Automatic Parallelization to Produce a Parallel Application - Answer (click here)
  3. Identifying and Parallelizing Reductions - Answer (click here)
  4. Automatic Parallelization of Codes Containing Calls - Answer (click here)
  5. Assisting the Compiler in Automatically Parallelizing Code - Answer (click here)
  6. Using OpenMP to Produce a Parallel Application - Answer (click here)
  7. Using OpenMP to Parallelize Loops - Answer (click here)
  8. Runtime Behavior of an OpenMP Application - Answer (click here)
  9. Variable Scoping Inside OpenMP Parallel Regions - Answer (click here)
  10. Parallelizing Reductions Using OpenMP - Answer (click here)
  11. Accessing Private Data Outside the Parallel Region - Answer (click here)
  12. Improving Work Distribution Using Scheduling (see the sketch after this list) - Answer (click here)
  13. Using Parallel Sections to Perform Independent Work - Answer (click here)
  14. Nested Parallelism - Answer (click here)
  15. Using OpenMP for Dynamically Defined Parallel Tasks - Answer (click here)
  16. Keeping Data Private to Threads - Answer (click here)
  17. Controlling the OpenMP Runtime Environment - Answer (click here)
  18. Waiting for Work to Complete - Answer (click here)
  19. Restricting the Threads That Execute a Region of Code - Answer (click here)
  20. Ensuring That Code in a Parallel Region Is Executed in Order - Answer (click here)
  21. Collapsing Loops to Improve Workload Balance - Answer (click here)
  22. Enforcing Memory Consistency - Answer (click here)
  23. An Example of Parallelization - Answer (click here)
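
For item 12 above, here is a minimal sketch of fixing an imbalanced loop with a schedule clause. The triangular loop bounds are an illustrative assumption; compile with -fopenmp.

    /* OpenMP loop scheduling - a sketch with assumed unequal work. */
    #include <stdio.h>

    int main(void) {
        enum { N = 1000 };
        double total = 0.0;

        /* The inner bound grows with i, so iterations do unequal work;
           dynamic scheduling hands out chunks of 16 iterations as
           threads become free, instead of splitting the range statically. */
        #pragma omp parallel for schedule(dynamic, 16) reduction(+:total)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < i; j++)
                total += (i + j) * 0.001;

        printf("total = %f\n", total);
        return 0;
    }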

Chapter 8 Hand-Coded Synchronization and Sharing

  1. Hand-Coded Synchronization and Sharing - Answer (click here)
  2. Atomic Operations - Answer (click here)
  3. Using Compare and Swap Instructions to Form More Complex Atomic Operations (see the sketch after this list) - Answer (click here)
  4. Enforcing Memory Ordering to Ensure Correct Operation - Answer (click here)
  5. Compiler Support of Memory-Ordering Directives - Answer (click here)
  6. Reordering of Operations by the Compiler - Answer (click here)
  7. Volatile Variables - Answer (click here)
  8. Operating System–Provided Atomics - Answer (click here)
  9. Lockless Algorithms - Answer (click here)
  10. Dekker’s Algorithm - Answer (click here)
  11. Producer-Consumer with a Circular Buffer - Answer (click here)
  12. Scaling to Multiple Consumers or Producers - Answer (click here)
  13. Scaling the Producer-Consumer to Multiple Threads - Answer (click here)
  14. Modifying the Producer-Consumer Code to Use Atomics - Answer (click here)
  15. The ABA Problem - Answer (click here)
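
Item 3 above can be illustrated briefly: an atomic increment built from a compare-and-swap loop. The sketch assumes GCC's __sync builtin, one of the compiler interfaces to the CAS instruction discussed in this chapter.

    /* Atomic increment from compare-and-swap - a sketch using the
       GCC __sync builtin. */
    #include <stdio.h>

    volatile int counter = 0;

    void atomic_increment(volatile int *addr) {
        int old;
        do {
            old = *addr;                        /* read the current value */
            /* Retry if another thread changed *addr since the read. */
        } while (!__sync_bool_compare_and_swap(addr, old, old + 1));
    }

    int main(void) {
        atomic_increment(&counter);
        printf("counter = %d\n", counter);
        return 0;
    }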

Chapter 9 Scaling with Multicore Processors

  1. Scaling with Multicore Processors - Answer (click here)
  2. Constraints to Application Scaling - Answer (click here)
  3. Hardware Constraints to Scaling - Answer (click here)
  4. Bandwidth Sharing Between Cores - Answer (click here)
  5. False Sharing (see the sketch after this list) - Answer (click here)
  6. Cache Conflict and Capacity - Answer (click here)
  7. Pipeline Resource Starvation - Answer (click here)
  8. Operating System Constraints to Scaling - Answer (click here)
  9. Multicore Processors and Scaling - Answer (click here)
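
Item 5 above lends itself to a short sketch: two per-thread counters are padded onto separate cache lines so that updates from different cores do not invalidate each other's line. The 64-byte line size is an assumption typical of x86 processors; link with -lpthread.

    /* Avoiding false sharing by padding - a sketch assuming 64-byte lines. */
    #include <stdio.h>
    #include <pthread.h>

    #define LINE 64

    struct padded_counter {
        long count;
        char pad[LINE - sizeof(long)];          /* keep neighbors off this line */
    };

    struct padded_counter counters[2];          /* one counter per thread */

    void *work(void *arg) {
        long id = (long)arg;
        for (long i = 0; i < 100000000; i++)
            counters[id].count++;               /* no cache-line ping-pong */
        return NULL;
    }

    int main(void) {
        pthread_t t[2];
        for (long i = 0; i < 2; i++)
            pthread_create(&t[i], NULL, work, (void *)i);
        for (long i = 0; i < 2; i++)
            pthread_join(t[i], NULL);
        printf("%ld %ld\n", counters[0].count, counters[1].count);
        return 0;
    }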

Chapter 10 Other Parallelization Technologies

  1. Other Parallelization Technologies - Answer (click here)
  2. GPU-Based Computing - Answer (click here)
  3. Language Extensions - Answer (click here)
  4. Alternative Languages - Answer (click here)
  5. Clustering Technologies - Answer (click here)
  6. Transactional Memory - Answer (click here)
  7. Vectorization - Answer (click here)
