[slurm-users] sbatch overallocation

Renfro, Michael Renfro at tntech.edu
Sat Oct 10 16:53:45 UTC 2020


I think the answer depends on why you’re trying to prevent the observed behavior:


  *   Do you want to ensure that one job requesting 9 tasks (and 1 CPU per task) can’t overstep its reservation and take resources away from other jobs on those nodes? Cgroups [1] should be able to confine the job to its 9 CPUs, so even if all 8 srun commands start at once inside the job, they’ll only drive up the nodes’ load average and not affect other users’ performance (see the configuration sketch after the links below).
  *   Are you trying to define a workflow where these 8 runs can execute in parallel, and you want to wait until they’ve all completed before starting another batch? Job dependencies using the --dependency flag to sbatch [2] should be able to handle that (see the example after the links below).

[1] https://slurm.schedmd.com/cgroups.html
[2] https://slurm.schedmd.com/sbatch.html
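
For the cgroup route, something along these lines should confine each job to the cores it was actually allocated. This is only a minimal sketch, assuming Slurm 20.02 with cgroup v1 and that nothing conflicting is already set at your site; adjust to your existing slurm.conf:

                # slurm.conf (controller and both compute nodes)
                ProctrackType=proctrack/cgroup
                TaskPlugin=task/affinity,task/cgroup

                # cgroup.conf (both compute nodes)
                CgroupAutomount=yes
                ConstrainCores=yes
                # optional: also fence memory to the job's allocation
                ConstrainRAMSpace=yes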
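For the dependency route, one way to chain batches is to capture the job ID of the first submission and make the next one wait for it. A sketch only: srun2.bash is the script from your mail, next_batch.bash is a hypothetical follow-up script:

                # Submit the first batch and capture its job ID
                jobid=$(sbatch --parsable srun2.bash)
                # The second batch starts only after the first completes successfully
                sbatch --dependency=afterok:${jobid} next_batch.bash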

From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Max Quast <max at quast.de>
Reply-To: Slurm User Community List <slurm-users at lists.schedmd.com>
Date: Saturday, October 10, 2020 at 6:06 AM
To: <slurm-users at lists.schedmd.com>
Subject: [slurm-users] sbatch overallocation

Dear slurm-users,

I built a slurm system consisting of two nodes (Ubuntu 20.04.1, slurm 20.02.5):

                # COMPUTE NODES
                GresTypes=gpu
                NodeName=lsm[216-217] Gres=gpu:tesla:1 CPUs=64 RealMemory=192073 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 State=UNKNOWN
                PartitionName=admin Nodes=lsm[216-217] Default=YES MaxTime=INFINITE State=UP

The slurmctld daemon is running on a separate Ubuntu system where no slurmd is installed.

If a user executes this script (sbatch srun2.bash)

                #!/bin/bash
                #SBATCH -N 2 -n9
                srun pimpleFoam -case /mnt/NFS/users/quast/channel395-10 -parallel > /dev/null &
                srun pimpleFoam -case /mnt/NFS/users/quast/channel395-11 -parallel > /dev/null &
                srun pimpleFoam -case /mnt/NFS/users/quast/channel395-12 -parallel > /dev/null &
                srun pimpleFoam -case /mnt/NFS/users/quast/channel395-13 -parallel > /dev/null &
                srun pimpleFoam -case /mnt/NFS/users/quast/channel395-14 -parallel > /dev/null &
                srun pimpleFoam -case /mnt/NFS/users/quast/channel395-15 -parallel > /dev/null &
                srun pimpleFoam -case /mnt/NFS/users/quast/channel395-16 -parallel > /dev/null &
                srun pimpleFoam -case /mnt/NFS/users/quast/channel395-17 -parallel > /dev/null &
                wait

8 srun steps, each with 9 tasks, are launched and distributed across the two nodes.

If several such scripts are started at the same time, all the srun commands are executed even though no free cores are available, so the nodes become overallocated.
How can this be prevented?

Thx :)

Greetings
max
