Example Scripts for Running Jobs

Here is a list of simple templates that you can use for submitting jobs to the queue. All scripts can be submitted with

qsub name_of_your_script

Basic Script:

#!/bin/bash
#PBS -l nodes=2:ppn=12
#PBS -l walltime=2:00:00
#PBS -V

cd $PBS_O_WORKDIR

mpirun -np 24 -machinefile $PBS_NODEFILE ./run.x

This script runs the executable run.x, which is compiled with MPI, across 2 nodes on 24 processors (12 per node).
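The process count passed to -np has to match the resources requested on the #PBS -l line. A minimal sketch, assuming a Torque-style node file with one line per allocated core, derives the count from $PBS_NODEFILE instead of hard-coding 24. The fallback node file and the host names node01 and node02 are hypothetical and only let the snippet be tried outside a job; the last line prints the launch command rather than running it.

```shell
#!/bin/bash
# Outside a PBS job PBS_NODEFILE is unset, so build a stand-in file
# (hypothetical hosts node01/node02, 12 lines each) for a dry run.
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE=$(mktemp)
    for node in node01 node02; do
        i=0
        while [ "$i" -lt 12 ]; do
            echo "$node"
            i=$((i + 1))
        done
    done > "$PBS_NODEFILE"
fi

# Assumption: the node file has one line per allocated core.
NP=$(( $(wc -l < "$PBS_NODEFILE") ))
NNODES=$(( $(sort -u "$PBS_NODEFILE" | wc -l) ))

echo "Allocated $NP cores on $NNODES nodes"
# Print the launch line; drop the leading echo inside a real job script.
echo mpirun -np "$NP" -machinefile "$PBS_NODEFILE" ./run.x
```

With this approach, changing nodes or ppn in the header does not require editing the mpirun line as well.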

Serial Job:

If you have a serial job, you can use the following script:

#!/bin/bash
#PBS -l nodes=1:ppn=12
#PBS -l walltime=2:00:00
#PBS -V

cd $PBS_O_WORKDIR

./run.x

The serial executable run.x will run on a single compute node, using only one of the 12 processors but with access to the entire memory of the node.

Job with Shared Memory Parallelism (e.g. OpenMP):

If you have an executable with shared memory parallelization, you can use

#!/bin/bash
#PBS -l nodes=1:ppn=12
#PBS -l walltime=2:00:00
#PBS -V

cd $PBS_O_WORKDIR

# You can use a larger number of threads, but test the speed before doing so
export OMP_NUM_THREADS=12

./run.x

The executable run.x will run on 12 cores, all sharing the memory of the node.
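A hard-coded thread count can drift out of sync with the ppn value on the #PBS -l line. One possible variant, assuming the scheduler is a Torque version that exports PBS_NUM_PPN (not confirmed for this cluster), reads the thread count from that variable and falls back to 12 otherwise.

```shell
#!/bin/bash
# Assumption: Torque exports PBS_NUM_PPN (the ppn value of the request).
# If it is unset, fall back to the 12 cores used in the script above.
export OMP_NUM_THREADS="${PBS_NUM_PPN:-12}"
echo "Running with $OMP_NUM_THREADS OpenMP threads"
```

These lines would replace the export line in the script above, placed before ./run.x.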

Running Multiple Parallel Jobs Sequentially:

Sometimes you may need to run several parallel jobs one after another (e.g. when the second run needs output from the first one). In this case you can use a script like the following:

#!/bin/bash
#PBS -l nodes=2:ppn=12
#PBS -l walltime=2:00:00
#PBS -V

cd $PBS_O_WORKDIR

mpirun -np 24 -machinefile $PBS_NODEFILE ./run1.x

mpirun -np 24 -machinefile $PBS_NODEFILE ./run2.x

mpirun -np 24 -machinefile $PBS_NODEFILE ./run3.x

run3.x will run only after run2.x is completed, and run2.x will only run after run1.x is completed.
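Note that the script above starts each run as soon as the previous one exits, whether or not it succeeded. If a later run depends on output from an earlier one, it may be worth stopping at the first failure. A small sketch of that idea follows; the helper name run_steps is hypothetical, and the mpirun lines are shown only as a commented usage example.

```shell
#!/bin/bash
# run_steps: run each argument as a command, in order, and stop at
# the first one that exits with a non-zero status.
run_steps() {
    for step in "$@"; do
        echo "starting: $step"
        # $step is left unquoted on purpose so it splits into command + arguments
        if ! $step; then
            echo "failed: $step" >&2
            return 1
        fi
    done
    return 0
}

# Possible use inside the job script (commented out here):
# run_steps \
#     "mpirun -np 24 -machinefile $PBS_NODEFILE ./run1.x" \
#     "mpirun -np 24 -machinefile $PBS_NODEFILE ./run2.x" \
#     "mpirun -np 24 -machinefile $PBS_NODEFILE ./run3.x" || exit 1
```

A simpler alternative with the same effect is to chain the three mpirun lines with &&.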

Running Multiple Jobs Simultaneously:

In some applications, you may have a serial or parallel code that needs to process several independent inputs at the same time. You can run several executables simultaneously with the following script:

#!/bin/bash
#PBS -l nodes=1:ppn=12
#PBS -l walltime=2:00:00
#PBS -V

cd $PBS_O_WORKDIR

./run1.x &

./run2.x &

./run3.x &

./run4.x &

wait

In this case four serial jobs will run simultaneously, sharing the resources of the 12-core node. The & sign puts each run in the background, so that it does not wait for the others to end. The final wait command is essential: it makes sure that the job does not quit before all runs are completed. A similar script can be constructed for simultaneous parallel jobs (replace each of the lines ./run1.x & through ./run4.x & with a line of the form mpirun -machinefile $PBS_NODEFILE -np 12 ./run1.x &).
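A bare wait does not report whether any of the background runs failed. The sketch below waits on each process ID separately and counts the failures; the helper name run_all is hypothetical, and the call with the four executables is shown only as a comment.

```shell
#!/bin/bash
# run_all: start every argument as a background command, then wait
# for each one and count how many exited with a non-zero status.
run_all() {
    pids=""
    failed=0
    for cmd in "$@"; do
        $cmd &                 # unquoted so it splits into command + arguments
        pids="$pids $!"        # $! is the PID of the job just started
    done
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    return "$failed"
}

# Possible use inside the job script (commented out here):
# run_all ./run1.x ./run2.x ./run3.x ./run4.x || echo "$? run(s) failed" >&2
```

Waiting on individual PIDs keeps the behaviour of the original script (the job ends only when every run has finished) while making failures visible in the job's error output.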

Available Queues:

  • Short queue: For jobs that run for less than one hour; waiting time in the queue is shorter.
qsub -q short name_of_your_script
  • Large memory: For jobs that require very large memory. The largemem queue provides 256 GB/node and the xlargemem queue provides 512 GB/node.
qsub -q largemem name_of_your_script
qsub -q xlargemem name_of_your_script
  • GPU node: For jobs that use GPUs.
qsub -q gpuq name_of_your_script
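
The queue can also be recorded in the job script itself with the #PBS -q directive, so that a plain qsub name_of_your_script is enough. Below is a sketch for the short queue, assuming a run of about 30 minutes; the walltime value and the guard around run.x are illustrative choices, not site requirements.

```shell
#!/bin/bash
#PBS -q short
#PBS -l nodes=1:ppn=12
#PBS -l walltime=0:30:00
#PBS -V

# PBS_O_WORKDIR is set by the scheduler; fall back to the current
# directory so the script can also be tried outside a job.
cd "${PBS_O_WORKDIR:-.}" || exit 1

# Run the executable only if it is present and executable.
if [ -x ./run.x ]; then
    ./run.x
else
    echo "run.x not found in $(pwd)" >&2
fi
```

A -q option given on the qsub command line normally takes precedence over the directive in the script, so the same script can still be redirected to another queue.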