Slurm Job Scheduler
The Pod cluster uses the Slurm job scheduler. It is similar to Torque, but we'll outline some differences below. There are also some nice 'cheat sheets' out there for converting from the Torque commands you know; one nice one is here
The major differences to be aware of:
- Queues are known as Partitions. You don't really need to care, except that instead of the "-q short" argument you used to send a job to the short (or some other) queue, you now use "-p short" (p for partition).
- You'll need to change the various 'PBS' variables in your script. The common ones are listed in the table below; many others are available at the link above.
- Partitions you can submit to: sbatch my.job (standard compute nodes), sbatch -p short my.job (short queue, 1-hour limit, for testing), sbatch -p gpu my.job (GPU nodes), sbatch -p largemem my.job (large-memory 1.5 TB nodes). You can also set the partition inside the job script itself; see the example just after this list.
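As a minimal sketch (the program name a.out is just a placeholder), here is how you would pick a partition from inside the script rather than on the command line:
#!/bin/bash -l
# request the short (testing) partition from within the script
#SBATCH -p short
#SBATCH --nodes=1 --ntasks-per-node=1
cd $SLURM_SUBMIT_DIR
./a.out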
What | Torque | Slurm |
---|---|---|
Nodes/Cores | #PBS -l nodes=1:ppn=10 | #SBATCH --nodes=1 --ntasks-per-node=10 |
Walltime | #PBS -l walltime=1:00:00 | #SBATCH --time=1:00:00 |
Mail to user | #PBS -M username@ucsb.edu | #SBATCH --mail-user=username@ucsb.edu |
Mail begin/end | #PBS -m be | #SBATCH --mail-type=BEGIN,END |
Working Directory | $PBS_O_WORKDIR | $SLURM_SUBMIT_DIR |
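Putting those rows together, a converted job header looks roughly like this (a sketch; adjust the core count, time, and address for your own job):
#!/bin/bash -l
#SBATCH --nodes=1 --ntasks-per-node=10
#SBATCH --time=1:00:00
#SBATCH --mail-user=username@ucsb.edu
#SBATCH --mail-type=BEGIN,END
cd $SLURM_SUBMIT_DIR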
Slurm passes along all of the environment variables from your login shell, so if you need a compiler, or Matlab, etc., do the 'module load' for it before you submit your job.
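For example (the module name below is hypothetical; run 'module avail' to see what is actually installed):
module load matlab      # hypothetical module name - check 'module avail'
sbatch my.job           # the job inherits the modules you loaded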
The basic command to run a job is 'sbatch' (in Torque it was 'qsub'). For example, say you have a file named 'test.slurm' that looks like this (first a serial job, then a parallel one):
#!/bin/bash -l
#Serial (1 core on one node) job...
#SBATCH --nodes=1 --ntasks-per-node=1
cd $SLURM_SUBMIT_DIR
time ./a.out >& logfile
And a simple parallel (MPI) example:
#!/bin/bash -l
# ask for 16 cores on each of two nodes (32 cores total)
#SBATCH --nodes=2 --ntasks-per-node=16
cd $SLURM_SUBMIT_DIR
/bin/hostname
mpirun -np $SLURM_NTASKS ./a.out
Notice that the main changes from PBS are a slightly different format for choosing the number of nodes/cores and a different variable for the directory to cd to. For MPI jobs things are actually somewhat simplified, in that you don't need to give mpirun a nodes file.
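For comparison, here is a sketch of a PBS-era launch line next to its Slurm equivalent (the core count is arbitrary, and this assumes a Slurm-aware MPI such as Open MPI):
# Torque/PBS typically needed an explicit machine file:
#   mpirun -np 32 -machinefile $PBS_NODEFILE ./a.out
# Under Slurm, the MPI launcher picks up the allocation automatically:
mpirun -np $SLURM_NTASKS ./a.out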
You run this job with 'sbatch test.slurm' (where you used to use 'qsub').
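Submitting looks something like this; unless you redirect output yourself, Slurm writes it to a file named slurm-<jobid>.out in the submit directory:
sbatch test.slurm
# Submitted batch job 123456   (sbatch prints the job ID; the number here is just an example)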
You can check on the status with 'squeue' (formerly 'qstat'), e.g.
squeue -u $USER (to see only your jobs; plain 'squeue' will show every job on the system)
You can look at details with 'scontrol show job JOBID', sort of like the old 'qstat -f' command.
To kill a job, use 'scancel JOBID' (formerly 'qdel JOBID'); add '-i' if you want to be prompted for confirmation first.
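A typical sequence looks like this (the job ID 123456 is hypothetical):
squeue -u $USER           # list only your jobs
scontrol show job 123456  # detailed information, like the old 'qstat -f'
scancel 123456            # cancel the job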
If you want an interactive node to test some things and make sure your job will run, you can do this with
srun -N 1 -p short --ntasks-per-node=4 --pty bash (which asks for 4 cores on a short-queue node, which will run for up to an hour)
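Once the interactive shell starts on the compute node, you can work just as you would inside a batch script (the module name below is hypothetical), then type 'exit' to release the node:
module load matlab   # hypothetical module name - load whatever your test needs
./a.out              # run your test interactively
exit                 # give the node back when you're done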
One more directive worth converting is the job name:
What | Torque | Slurm |
---|---|---|
Job name | #PBS -N myjob | #SBATCH -J myjob |