[slurm-users] sbatch overallocation
mercan
ahmet.mercan at uhem.itu.edu.tr
Sat Oct 10 13:02:33 UTC 2020
Hi;
You can submit each pimpleFoam run as a separate job. Or, if you really
want to submit them as a single job, you can use a program such as GNU
parallel to run only as many of them at a time as there are CPUs available:
https://www.gnu.org/software/parallel/
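
A minimal sketch of the first option (one job per case), reusing the paths
and the -N 2 -n 9 request from the script quoted below. The names
SBATCH_CMD, CASE_ROOT and submit_cases are only illustrative, and it
assumes the cluster is configured so that queued jobs wait for free CPUs
rather than sharing them:

```shell
#!/bin/bash
# Sketch: submit every pimpleFoam case as its own job, so the scheduler
# can queue the cases instead of all srun steps starting at once.
# SBATCH_CMD, CASE_ROOT and submit_cases are hypothetical names.
# Set SBATCH_CMD=echo for a dry run that only prints the arguments.
SBATCH_CMD="${SBATCH_CMD:-sbatch}"
CASE_ROOT=/mnt/NFS/users/quast

submit_cases() {
    local i
    for i in "$@"; do
        # Same resource request as the original script: 2 nodes, 9 tasks.
        "$SBATCH_CMD" -N 2 -n 9 \
            --wrap="srun pimpleFoam -case ${CASE_ROOT}/channel395-${i} -parallel > /dev/null"
    done
}
```

Calling `submit_cases 10 11 12 13 14 15 16 17` would then submit the eight
cases as eight independent jobs.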
regards;
Ahmet M.
On 10.10.2020 at 14:05, Max Quast wrote:
>
> Dear slurm-users,
>
> I built a slurm system consisting of two nodes (Ubuntu 20.04.1, slurm
> 20.02.5):
>
> # COMPUTE NODES
> GresTypes=gpu
> NodeName=lsm[216-217] Gres=gpu:tesla:1 CPUs=64 RealMemory=192073 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 State=UNKNOWN
> PartitionName=admin Nodes=lsm[216-217] Default=YES MaxTime=INFINITE State=UP
>
> The slurmctld is running on a separate Ubuntu system where no slurmd is
> installed.
>
> If a user executes this script (sbatch srun2.bash)
>
> #!/bin/bash
> #SBATCH -N 2 -n9
>
> srun pimpleFoam -case /mnt/NFS/users/quast/channel395-10 -parallel > /dev/null &
> srun pimpleFoam -case /mnt/NFS/users/quast/channel395-11 -parallel > /dev/null &
> srun pimpleFoam -case /mnt/NFS/users/quast/channel395-12 -parallel > /dev/null &
> srun pimpleFoam -case /mnt/NFS/users/quast/channel395-13 -parallel > /dev/null &
> srun pimpleFoam -case /mnt/NFS/users/quast/channel395-14 -parallel > /dev/null &
> srun pimpleFoam -case /mnt/NFS/users/quast/channel395-15 -parallel > /dev/null &
> srun pimpleFoam -case /mnt/NFS/users/quast/channel395-16 -parallel > /dev/null &
> srun pimpleFoam -case /mnt/NFS/users/quast/channel395-17 -parallel > /dev/null &
>
> wait
>
> 8 jobs with 9 threads are launched and distributed on two nodes.
>
> If more such scripts get started at the same time, all the srun
> commands will be executed even though no free cores are available. So
> the nodes are overallocated.
>
> How can this be prevented?
>
> Thx :)
>
> Greetings
>
> max
>