[slurm-users] Multiple Program Runs using srun in one Slurm batch Job on one node

Ward Poelmans ward.poelmans at vub.be
Wed Jun 15 15:25:52 UTC 2022


Hi Guillaume,

On 15/06/2022 16:59, Guillaume De Nayer wrote:
> 
> Perhaps I misunderstand the Slurm documentation...
> 
> I thought that the --exclusive option used in combination with sbatch
> will reserve the whole node (40 cores) for the job (submitted with
> sbatch). This part is working fine. I can check it with sacct.
> 
> Then, this job starts subtasks on the reserved 40 cores with srun.
> Therefore I'm using "-n1 -c1" in combination with "srun". I thought that
> it was possible to use the reserved cores inside this job using srun.

You're correct. --exclusive will give you all cores on the nodes but only as much memory as requested.

  
> The following slightly modified job without --exclusive and with
> --ntasks=2 leads to a similar problem: Only one srun is running at a
> time. The second starts directly after the first one finished.
> 
> #!/bin/bash
> #SBATCH --job-name=test_multi_prog_srun
> #SBATCH --ntasks=2
> #SBATCH --partition=short
> #SBATCH --time=02:00:00
> 
> srun -vvv --exact -n1 -c1 sleep 20 > srun1.log 2>&1 &
> srun -vvv --exact -n1 -c1 sleep 30 > srun2.log 2>&1 &
> wait

This should work... It works on our cluster. Are you sure they don't run in parallel?
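
One way to check this without reading the -vvv logs is to have each step record its own start time and compare them afterwards. The following is only a rough sketch, not something taken from our setup: the file names step1.start and step2.start, the shortened sleep times and the "launcher" fallback (so the script can also be tried outside a Slurm job) are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: log when each backgrounded step starts, then compare the times.
# "launcher" is srun inside a Slurm job and empty elsewhere (an assumption
# for illustration, so the script can also be tried outside Slurm).
if [ -n "${SLURM_JOB_ID:-}" ] && command -v srun >/dev/null 2>&1; then
    launcher="srun --exact -n1 -c1"
else
    launcher=""
fi

# Each step writes its start time (seconds since the epoch) to a file
# in the current directory, then sleeps.
$launcher bash -c 'date +%s > step1.start; sleep 2' &
$launcher bash -c 'date +%s > step2.start; sleep 3' &
wait

s1=$(cat step1.start)
s2=$(cat step2.start)
gap=$(( s2 > s1 ? s2 - s1 : s1 - s2 ))

# If the steps were serialized, the later one cannot start before the
# earlier one has finished its sleep, so the gap is at least 2 seconds.
if [ "$gap" -lt 2 ]; then
    verdict="parallel"
else
    verdict="serialized"
fi
echo "start gap: ${gap}s -> steps look ${verdict}"
```

If the gap is about as long as the first sleep, the second step waited for the first one to finish.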

We usually recommend using GNU parallel or xargs, like:

xargs -P $SLURM_NTASKS srun -N 1 -n 1 -c 1 --exact sleep 30
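
Note that xargs reads its arguments from standard input, so it needs some input to work on. A slightly fuller sketch, under a few assumptions: the input list (two sleep durations fed in with printf), the default of 2 tasks when SLURM_NTASKS is unset, and the "launcher" fallback for trying it outside a Slurm job are all made up for illustration.

```shell
#!/bin/bash
# Sketch: start one job step per input line, at most $SLURM_NTASKS at a time.
# Outside a Slurm job the commands run directly (illustrative fallback), so
# the behaviour of xargs -P itself can be checked on any machine.
ntasks="${SLURM_NTASKS:-2}"
if [ -n "${SLURM_JOB_ID:-}" ] && command -v srun >/dev/null 2>&1; then
    launcher="srun -N 1 -n 1 -c 1 --exact"
else
    launcher=""
fi

start=$(date +%s)
# -I{} runs one command per input line and substitutes the line for {}.
# -P sets how many of these commands may run at the same time.
printf '%s\n' 2 2 | xargs -P "$ntasks" -I{} $launcher sleep {}
end=$(date +%s)

elapsed=$((end - start))
echo "elapsed: ${elapsed}s"
```

With two slots available, the two 2-second sleeps should finish in roughly 2 seconds in total; around 4 seconds would mean they ran one after the other.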


Ward