High-Performance Computing (WS 2020/21)

This course will be given in English. Due to the ongoing coronavirus pandemic, it will be held as a purely online course without any attendance requirements.

Subject and Goal

The goal of this course is to teach the fundamentals of high-performance computing. That is, we will discuss programming models, languages and frameworks for efficiently using parallel computer systems. The lecture will be complemented by a considerable number of practical programming exercises that allow students to gain hands-on experience with programming, performance optimization and debugging on parallel computer systems. To this end, students will get access to the HPC clusters operated by the Paderborn Center for Parallel Computing (PC²).

The tentative list of lecture topics is as follows:

  1. Introduction
  2. Relevant fundamentals of computer architecture (SIMD, caches)
  3. Models for performance (roofline, weak/strong scaling, work-span, LogP)
  4. Applications (examples that will be used throughout the lecture: n-body, stencils, dense and sparse linear algebra, conjugate gradient/matrix multiplication)
  5. Single node optimization (cache blocking, memory access pattern, vectorization)
  6. MPI basics (SPMD, communicator, messages, network, blocking two-sided communication)
  7. MPI next steps (collectives, one-sided, non-blocking, derived data types)
  8. OpenMP basics (threading, work-sharing, parallel for, scheduling)
  9. OpenMP tasking (task, dependencies, taskloops, ...)
  10. Libraries (BLAS, LAPACK)
  11. Performance engineering (profiling, bottleneck analysis)

Optional topics

  1. Patterns of parallel programming (map, reduce, gather, Berkeley dwarfs, ...)
  2. HW Accelerators (application acceleration with GPUs and/or FPGAs)

 

Dates and Materials

All materials (slides, exercises, programming exercises) and current information will be provided via the PANDA learning management system; see the PANDA page for this course.

Parts of the lecture and exercises will be loosely based on the textbook Peter S. Pacheco, An Introduction to Parallel Programming, Morgan Kaufmann Publishers, 2011. The book is available online within the Paderborn University Network (use VPN for DFN-AAI for access from outside). The book also contains a number of code excerpts from programs that illustrate the parallel programming techniques it introduces. The source code for these examples is available here.

Prerequisites

This lecture assumes that you are familiar with the foundations of modern computer architecture and with performance-oriented, low-level programming in C and basic C++. You can check whether you meet these prerequisites in an online self-assessment test.

Exam

The exam in High-Performance Computing is a written exam. It will be offered twice per year after the course, once in the first examination period and once in the second examination period.