Advanced Job Management @ A*STAR II

The advanced job management training covers array jobs, job dependencies, checkpoint and restart, file stage-in/out, and troubleshooting job submission issues. It helps users run their jobs more efficiently by making the best use of the hardware.

30 September 2019

Charles Babbage Room, Level 17, Connexis South Tower, 1 Fusionopolis Way, Singapore 138632

Overview

  • Introduction
  • Job management and project info in brief
  • Job exit codes
  • MPI jobs in batch mode
    • “mpiprocs” parameter
    • MPI tight integration
  • Multithreaded jobs and OMP_NUM_THREADS in batch mode
  • Details on memory enforcement
  • Job Arrays
  • Job dependencies
  • PBS Reservations
  • Using checkpointing
  • File stage-in/out
  • Using IME
  • Troubleshooting
  • Lab Session
  • Using Compute Manager
  • Using Display Manager

Prerequisites

  • A valid user account on the NSCC system, ASPIRE1
  • An SSH client such as PuTTY or MobaXterm, pre-installed on the user’s laptop, to connect to ASPIRE1
  • Basic understanding of Linux commands:
    1. File management
    2. “vi” editor
    3. Using “modules” in Linux
    4. Process management
  • Basic knowledge of PBS Pro job management

By the end of this course, participants will have a fair understanding of advanced job management topics such as array jobs, reservations, job dependencies, file stage-in/out, IME, Compute Manager and Display Manager.