[slurm-users] Sharing a node with non-gres and gres jobs

Tue Mar 19 12:47:48 UTC 2019

On Tue, Mar 19, 2019 at 8:34 AM Peter Steinbach <steinbac at mpi-cbg.de> wrote:
>
> Hi,
>
> We are struggling with a Slurm 18.08.5 installation of ours. Our GPU
> nodes have a considerable number of cores but "only" 2 GPUs each.
> While people run jobs using the GPUs, non-GPU jobs can still start on
> those nodes. However, we found out the hard way that the inverse is
> not true.
>
> For example, let's say I have a 4-core GPU node called gpu1. A non-GPU job
> $ sbatch --wrap="sleep 10 && hostname" -c 3
> comes in and starts running on gpu1.
> We observed that the job produced by the following command, targeting
> the same node:
> $ sbatch --wrap="hostname" -c 1 --gres=gpu:1 -w gpu1
> will stay pending, waiting for resources, until the non-GPU job has
> finished. This is not something we want.
>
> The sample gres.conf and slurm.conf from a Docker-based Slurm cluster
> where I was able to reproduce the issue are available here:
> https://raw.githubusercontent.com/psteinb/docker-centos7-slurm/18.08.5-with-gres/slurm.conf
> https://raw.githubusercontent.com/psteinb/docker-centos7-slurm/18.08.5-with-gres/gres.conf
>
> We are not sure how to handle the situation, as we would like both
> jobs to run on the GPU node at the same time to maximize the utility
> of our hardware for our users.
>
> Any hints or ideas are highly appreciated.
> Thanks for your help,
> Peter
>

You don't mention your SelectType and SelectTypeParameters settings. I
believe the default plugin (select/linear) allocates whole nodes to
jobs. We have CR_LLN set in ours, which spreads jobs across the
least-loaded nodes instead. My memory is a little foggy on the
details, but a lot of configuration is possible via SelectType,
SelectTypeParameters and the scheduler settings. Read up on them in
the slurm.conf man page or on the SchedMD website.
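
As a quick way to check which of these settings are in effect, here is
a minimal sketch of a shell helper that prints the SelectType and
SelectTypeParameters lines from a slurm.conf file. The function name
show_select_settings and the default path /etc/slurm/slurm.conf are my
own assumptions, not anything from your setup; pass the real path as
the first argument.

```shell
#!/bin/sh
# Hypothetical helper: print the SelectType / SelectTypeParameters
# lines from a slurm.conf file.
# The default path below is an assumption; adjust it for your site.
show_select_settings() {
    conf="${1:-/etc/slurm/slurm.conf}"
    # Match "SelectType=..." and "SelectTypeParameters=...",
    # ignoring case and leading whitespace; commented-out lines
    # (starting with '#') are not matched.
    grep -iE '^[[:space:]]*SelectType(Parameters)?[[:space:]]*=' "$conf"
}

# On a running cluster the effective values can also be queried with:
#   scontrol show config | grep -i '^SelectType'
```

If nothing is printed, the file does not set either option and Slurm
falls back to its defaults. For sharing a node between jobs, the
settings I would look at first are the consumable-resource ones such
as SelectType=select/cons_res with SelectTypeParameters=CR_Core, but
please check the man page for your version before relying on that.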