There is one significant difference between running an application on an HPC cluster and on a desktop workstation: on an HPC cluster, the computational work must be packaged into a job, with a script that specifies the resources the job needs and the commands that perform the work. The job is then submitted to the cluster through a job manager (scheduler). BIRUNI Grid uses TORQUE to schedule and run each job on a dedicated portion of the cluster. This tutorial explains how to prepare the job script, submit the job, and retrieve the results.

Submitting a Job

Jobs are submitted with the qsub command, which takes the job script as its argument. Commonly used resource options are:


  • -l walltime=walltime
    Maximum wallclock time the job will need. Default depends on queue, mostly 1 hour. Walltime is specified in seconds or as hh:mm:ss or mm:ss.
  • -l mem=memory
    Maximum memory per node the job will need. Default depends on queue, normally 2GB for serial jobs and the full node for parallel jobs. Memory should be specified with units, e.g. 500MB or 8GB.
  • -l procs=num
    Total number of CPUs required. Use this if it does not matter how CPUs are grouped onto nodes, e.g. for a pure MPI job. Do not combine this with -l nodes=num, or odd behavior will ensue.
  • -l nodes=num:ppn=num
    Number of nodes and number of processors per node required. Use this if you need processes to be grouped onto nodes, e.g. for an MPI/OpenMP hybrid job with 4 MPI processes and 8 OpenMP threads each, use -l nodes=4:ppn=8. Do not combine this with -l procs=num, or odd behavior will ensue. Default is 1 node and 1 processor per node. When using multiple nodes, the job script is executed on the first allocated node.
    Torque will set the environment variables PBS_NUM_NODES to the number of nodes requested, PBS_NUM_PPN to the value of ppn and PBS_NP to the total number of processes available to the job.
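The options above can be given on the qsub command line or written into the job script as #PBS directive lines. The following is a minimal sketch, not a site-approved template: the file name job.sh, the job name example_job, and the resource values are hypothetical placeholders. It writes a small job script and submits it only if qsub is available on the current machine.

```shell
# Write a minimal TORQUE job script.
# job.sh, example_job and all resource values are hypothetical examples.
cat > job.sh <<'EOF'
#!/bin/bash
#PBS -N example_job
#PBS -l walltime=01:00:00
#PBS -l mem=2GB
#PBS -l nodes=1:ppn=4

# Change to the directory the job was submitted from
# (falls back to the current directory when run outside TORQUE).
cd "${PBS_O_WORKDIR:-.}"

# Report the allocation using the variables TORQUE sets for the job.
echo "nodes=${PBS_NUM_NODES:-unset} ppn=${PBS_NUM_PPN:-unset} np=${PBS_NP:-unset}"
EOF

# Submit only when qsub exists, so the sketch is harmless elsewhere.
if command -v qsub >/dev/null 2>&1; then
    qsub job.sh
else
    echo "qsub not found; job.sh was written but not submitted"
fi
```

On a successful submission, qsub prints the Job ID, which the monitoring and cancel commands take as their argument.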


Monitoring Jobs

To see the status of a single job - or a list of specific jobs - pass the Job IDs to qstat, as in the following example: 
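A minimal sketch of such a call follows. The Job IDs are hypothetical; substitute the IDs that qsub printed when the jobs were submitted. The guard only prints the command when qstat is not installed.

```shell
# Hypothetical job IDs; use the IDs that qsub printed at submission time.
job_ids="12345 12346"
cmd="qstat $job_ids"

if command -v qstat >/dev/null 2>&1; then
    # Left unquoted on purpose so each ID is passed as a separate argument.
    $cmd
else
    echo "qstat not found; would run: $cmd"
fi
```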


When you start pbstop you see something like the annotated screenshot below. You might need to resize your terminal to make it all fit: 


Canceling a Job

To kill a running job, or remove a queued job from the queue, use qdel:
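As a sketch, with a hypothetical Job ID standing in for the one reported by qsub or qstat:

```shell
# Hypothetical job ID; replace with the ID of the job to cancel.
job_id="12345"
cmd="qdel $job_id"

if command -v qdel >/dev/null 2>&1; then
    $cmd
else
    echo "qdel not found; would run: $cmd"
fi
```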