- Introduction and brief summary of topics from Optimisation Techniques Part I
- Further information on the hardware architecture
- Sockets, Cores, Caches and NUMA
- InfiniBand network
- Arithmetic intensity and the roofline model
- Process and thread affinity
- Advanced compiler options
- Requirements for vectorisable loops
- Architecture-specific optimisation
- Precision and reproducibility
- Options for correctness checking and debugging
- MPI Optimisation
- Gathering communication statistics
- Improving MPI communication
- A valid user account on the NSCC system, ASPIRE1
- Laptop for use in hands-on sessions
- Familiarity with the topics covered by the Introductory Class (connecting to the system, editing files in Linux and submitting jobs)
- Familiarity with the topics covered by Optimisation Techniques Part I
After this course, a user should understand the factors that limit the performance of compute-intensive applications and know the techniques that can be used to improve it.