== Workflows ==

The newly installed version of Slurm supports workflows. The idea is to have an overlap between dependent jobs, so that they can exchange data over the network instead of writing and reading it on storage. To enable workflows, we have introduced a new {{{--delay}}} option for the {{{sbatch}}} command. Here is a simple example script:

{{{
[huda1@deepv scripts]$ cat test.sh
#!/bin/sh

NAME=$(hostname)
echo "$NAME: Going to sleep for $1 seconds"
sleep $1
echo "$NAME: Awake"

[huda1@deepv scripts]$ cat batch_workflow.sh
#!/bin/bash
#SBATCH -p sdv -N2 -t3

#SBATCH packjob

#SBATCH -p sdv -N1 -t3 --delay 2

srun test.sh 175

[huda1@deepv scripts]$
}}}

The {{{sbatch}}} script above shows the usage of {{{--delay}}}. It takes a value in minutes: the corresponding job of the heterogeneous job pack is delayed by that many minutes relative to the start of the first job in the pack. After submission of such a job pack, Slurm divides it into separate jobs at the time of resource reservation, so multiple jobs appear in the output of the {{{squeue}}} command. Here is an example execution of this script:

{{{
[huda1@deepv scripts]$ sbatch batch_workflow.sh
Submitted batch job 81458
[huda1@deepv scripts]$ squeue -u huda1
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  81458       sdv batch_wo    huda1 CF       0:01      2 deeper-sdv[02-03]
  81459       sdv batch_wo    huda1 PD       0:00      1 (Reservation)

[huda1@deepv scripts]$
}}}

Here the second job (81459) will start 2 minutes after the start of the first job (81458). Likewise, each of the separated jobs in the job pack writes to its own output file. The final outputs are:
{{{
[huda1@deepv scripts]$ cat slurm-81458.out
deeper-sdv02: Going to sleep for 175 seconds
deeper-sdv03: Going to sleep for 175 seconds
deeper-sdv02: Awake
deeper-sdv03: Awake

[huda1@deepv scripts]$ cat slurm-81459.out
deeper-sdv01: Going to sleep for 175 seconds
deeper-sdv01: Awake

[huda1@deepv scripts]$
}}}

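The start-time offset implied by {{{--delay}}} is simple arithmetic: the delayed job is expected to start the given number of minutes after the first job of the pack. A minimal sketch in plain shell (no Slurm required; {{{expected_start}}} is our own illustrative helper, not a Slurm command):

```shell
#!/bin/sh
# Illustration only: map a --delay value (minutes) to an expected start.
#   $1: epoch seconds at which the first job of the pack starts
#   $2: the --delay value of the dependent job, in minutes
expected_start() {
    echo $(( $1 + $2 * 60 ))
}

# With --delay 2, a pack whose first job starts at t=1000 should see the
# dependent job start around t=1120.
expected_start 1000 2
```

In practice the actual start also depends on the scheduler, so the computed time is a lower bound rather than a guarantee.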
Another feature to note is that if there are multiple jobs in a job pack and any number of consecutive jobs have the same {{{--delay}}} value, they are combined into a new heterogeneous job. Here is an example of such a script:
{{{
[huda1@deepv scripts]$ cat batch_workflow_complex.sh
#!/bin/bash

#SBATCH -p sdv -N 2 -t 3
#SBATCH -J first

#SBATCH packjob

#SBATCH -p sdv -N 1 -t 3 --delay 2
#SBATCH -J second

#SBATCH packjob

#SBATCH -p sdv -N 1 -t 2 --delay 2
#SBATCH -J second

#SBATCH packjob

#SBATCH -p sdv -N 2 -t 3 --delay 4
#SBATCH -J third

if [ "$SLURM_JOB_NAME" == "first" ]
then
    srun ./test.sh 150

elif [ "$SLURM_JOB_NAME" == "second" ]
then
    srun ./test.sh 150 : ./test.sh 115

elif [ "$SLURM_JOB_NAME" == "third" ]
then
    srun ./test.sh 155

fi

[huda1@deepv scripts]$
}}}

Note that the {{{--delay}}} values for the second and third jobs in the script are equal. Also note the use of the environment variable {{{SLURM_JOB_NAME}}} in the script to decide which command runs in which job. The example execution leads to the following:
{{{
[huda1@deepv scripts]$ sbatch batch_workflow_complex.sh
Submitted batch job 81460

[huda1@deepv scripts]$ squeue -u huda1
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
81461+0       sdv   second    huda1 PD       0:00      1 (Resources)
81461+1       sdv   second    huda1 PD       0:00      1 (Resources)
  81463       sdv    third    huda1 PD       0:00      2 (Resources)
  81460       sdv    first    huda1 PD       0:00      2 (Resources)

[huda1@deepv scripts]$
}}}

Note that the submitted heterogeneous job has been divided into a single job (81460), a job pack (81461) and again a single job (81463). Accordingly, three different output files are generated, one for each new job:
{{{
[huda1@deepv scripts]$ cat slurm-81460.out
deeper-sdv03: Going to sleep for 150 seconds
deeper-sdv04: Going to sleep for 150 seconds
deeper-sdv03: Awake
deeper-sdv04: Awake

[huda1@deepv scripts]$ cat slurm-81461.out
deeper-sdv01: Going to sleep for 150 seconds
deeper-sdv02: Going to sleep for 115 seconds
deeper-sdv02: Awake
deeper-sdv01: Awake

[huda1@deepv scripts]$ cat slurm-81463.out
deeper-sdv01: Going to sleep for 155 seconds
deeper-sdv02: Going to sleep for 155 seconds
deeper-sdv01: Awake
deeper-sdv02: Awake

[huda1@deepv scripts]$
}}}
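The combining rule can be sketched as follows. This is our own interpretation of the observed behaviour, not Slurm's actual implementation, and {{{split_pack}}} is a hypothetical helper: scan the {{{--delay}}} value of each pack member (0 when absent) and merge consecutive members with equal delays into one group.

```shell
#!/bin/bash
# Sketch (our interpretation, not Slurm source): pack members are
# separated by "#SBATCH packjob" lines; consecutive members with equal
# --delay values form one group; a member without --delay counts as 0.
split_pack() {
    awk '
        BEGIN { d = 0 }
        # A packjob line closes the current member: record its delay.
        /^#SBATCH[[:space:]]+packjob/ { delays = delays " " d; d = 0; next }
        # Pick up a --delay value on any other #SBATCH line.
        /^#SBATCH/ && match($0, /--delay[[:space:]=]+[0-9]+/) {
            split(substr($0, RSTART, RLENGTH), a, /[[:space:]=]+/)
            d = a[2]
        }
        END {
            delays = delays " " d              # close the last member
            n = split(delays, dl, " ")
            grp = 1
            printf "group %d: delay %s", grp, dl[1]
            for (i = 2; i <= n; i++) {
                if (dl[i] == dl[i-1])
                    printf " (+1 member)"
                else
                    printf "\ngroup %d: delay %s", ++grp, dl[i]
            }
            print ""
        }
    '
}

# The members of batch_workflow_complex.sh carry delays 0, 2, 2 and 4:
split_pack <<'EOF'
#SBATCH -p sdv -N 2 -t 3
#SBATCH packjob
#SBATCH -p sdv -N 1 -t 3 --delay 2
#SBATCH packjob
#SBATCH -p sdv -N 1 -t 2 --delay 2
#SBATCH packjob
#SBATCH -p sdv -N 2 -t 3 --delay 4
EOF
```

This reproduces the split seen above: one single job (delay 0), one pack of two members (delay 2) and another single job (delay 4).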
If a job exits earlier than the time requested by the user, the corresponding reservation is deleted automatically and the resources become available to other jobs. However, users should choose the requested time carefully when submitting workflows, as large time values can delay the scheduling of the workflow, depending on the availability of resources.