
Version 8 (modified by Peter Niessen, 7 years ago)

Information about the batch system (SLURM)

For Torque, please see the old documentation.

Please also see /etc/slurm/README.

Overview

Available Partitions

Interactive Jobs

Batch Jobs

FAQ

Why's my job not running?

How can I check which jobs are running on the machine?

How do I do chain jobs with dependencies?

How can I get a list of broken nodes?

Can I still use the old DEEP Booster nodes?

Can I join stderr and stdout like it was done with -joe in Torque?

What's the equivalent of qsub -l nodes=x:ppn=y:cluster+n_b:ppn=p_b:booster?

pbs/slurm dictionary

PBS command                        closest slurm equivalent
qsub                               sbatch
qsub -I                            salloc + srun --pty bash -i
qsub into an existing reservation  … --reservation=<reservation> …
pbsnodes                           scontrol show node
pbsnodes (-ln)                     sinfo (-R) or sinfo -Rl -h -o "%n %12U %19H %6t %E" | sort -u
pbsnodes -c -N n <node>            scontrol update NodeName=<node> State=RESUME
pbsnodes -o <node>                 scontrol update NodeName=<node> State=DRAIN reason="some comment here"
pbstop                             smap
qstat                              squeue
checkjob <job>                     scontrol show job <job>
checkjob -v <job>                  scontrol show -d job <job>
showres                            scontrol show res
setres                             scontrol create reservation [ReservationName=<reservation>] user=partec Nodes=j3c[053-056] StartTime=now duration=Unlimited Flags=IGNORE_JOBS
setres -u <user> ALL               scontrol create reservation ReservationName=<some name> user=<user> Nodes=ALL startTime=now duration=unlimited FLAGS=maint,ignore_jobs
releaseres                         scontrol delete ReservationName=<reservation>
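
For users who still type the PBS names out of habit, the table above could be wrapped in a small lookup helper. The following is only a sketch: the function name pbs2slurm is hypothetical (it is not part of SLURM or Torque), it covers only the option-free entries of the table, and it merely prints the suggested command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with SLURM or Torque.
# Prints the closest SLURM equivalent of a PBS command name,
# using only the mappings listed in the table above.
pbs2slurm() {
    case "$1" in
        qsub)       echo "sbatch" ;;
        pbsnodes)   echo "scontrol show node" ;;
        pbstop)     echo "smap" ;;
        qstat)      echo "squeue" ;;
        checkjob)   echo "scontrol show job <job>" ;;
        showres)    echo "scontrol show res" ;;
        releaseres) echo "scontrol delete ReservationName=<reservation>" ;;
        *)
            # Unknown name: report on stderr and signal failure.
            echo "pbs2slurm: no known equivalent for '$1'" >&2
            return 1
            ;;
    esac
}

# Example lookups; nothing is executed on the batch system.
pbs2slurm qstat
pbs2slurm checkjob
```

The placeholders <job> and <reservation> are printed literally and have to be replaced by hand, exactly as in the table.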