[slurm-users] Job not running with Resource Reason even though resources appear to be available

Mon Jan 25 15:07:44 UTC 2021

I tried submitting jobs with --gres-flags=disable-binding but
this has not made any difference.  Jobs asking for GPUs are still only
being run if a core defined in gres.conf for the GPU is free.

Basically, it seems the option is ignored.
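
To make the symptom concrete, here is a toy Python model of the difference
between enforced and relaxed GPU/core binding on a 32-core node whose GPUs
all list Cores=0-15. It is only a sketch and not Slurm's actual scheduling
logic: the function names, the enforce_binding parameter and the one-core
job size are assumptions made up for illustration.

```python
def parse_core_list(spec):
    """Expand a gres.conf-style core list such as "0-15" or "0-3,8" into a set."""
    cores = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cores.update(range(int(lo), int(hi) + 1))
        else:
            cores.add(int(part))
    return cores


def gpu_job_can_start(gpu_cores, busy_cores, enforce_binding=True, total_cores=32):
    """Toy model (hypothetical): can a one-core GPU job start on this node?"""
    free = set(range(total_cores)) - busy_cores
    if not free:
        return False  # no free core anywhere on the node
    if enforce_binding:
        # Binding enforced: need a free core among those listed for the GPU.
        return bool(free & gpu_cores)
    # Binding relaxed: any free core will do.
    return True


gpu_cores = parse_core_list("0-15")  # Cores= value from the gres.conf quoted below
busy = parse_core_list("0-15")       # other jobs happen to hold the GPU-local socket

print(gpu_job_can_start(gpu_cores, busy))                         # False
print(gpu_job_can_start(gpu_cores, busy, enforce_binding=False))  # True
```

The first result matches what the node does now. The second is what I
expected --gres-flags=disable-binding to give, but the jobs still behave as
in the first case.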

-- Paul Raines (http://help.nmr.mgh.harvard.edu)

On Sun, 24 Jan 2021 11:39am, Paul Raines wrote:

> Thanks Chris.
>
> I think you have identified the issue here or are very close.  My gres.conf
> on the rtx-04 node for example is:
>
> AutoDetect=nvml
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia0 Cores=0-15
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia1 Cores=0-15
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia2 Cores=0-15
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia3 Cores=0-15
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia4 Cores=0-15
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia5 Cores=0-15
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia6 Cores=0-15
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia7 Cores=0-15
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia8 Cores=0-15
> Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia9 Cores=0-15
>
> There are 32 cores (HT is off).  But the daughter card that holds all
> 10 of the RTX8000s connects to only one socket as can be seen from
> 'nvidia-smi topo -m'
>
> It's odd, though, in that my tests on my identically configured
> rtx6000 partition did not show that behavior, but maybe it is
> due to just the "random" cores that got assigned to jobs there
> all having at least one core on the "right" socket.
>
> Anyway, how do I turn off this "affinity enforcement" as it is
> more important that a job run with a GPU on its non-affinity socket
> than just wait and not run at all?
>
> Thanks
>
> -- Paul Raines (http://help.nmr.mgh.harvard.edu)
>
>
>
> On Sat, 23 Jan 2021 3:19pm, Chris Samuel wrote:
>
>>  On Saturday, 23 January 2021 9:54:11 AM PST Paul Raines wrote:
>>
>>>  Now rtx-08 which has only 4 GPUs seems to always get all 4 used.
>>>  But the others seem to always only get half used (except rtx-07
>>>  which somehow gets 6 used so another weird thing).
>>>
>>>  Again if I submit non-GPU jobs, they end up allocating all the
>>>  cores/cpus on the nodes just fine.
>>
>>  What does your gres.conf look like for these nodes?
>>
>>  One thing I've seen in the past is where the core specifications for the
>>  GPUs are out of step with the hardware and so Slurm thinks they're on the
>>  wrong socket.  Then when all the cores in that socket are used up Slurm
>>  won't put more GPU jobs on the node without the jobs explicitly asking to
>>  not do locality.
>>
>>  One thing I've noticed is that prior to Slurm 20.02 the documentation
>>  for gres.conf used to say:
>> 
>> #  If your cores contain multiple threads only the first thread
>> #  (processing unit) of each core needs to be listed.
>>
>>  but that language is gone from 20.02 and later and the change isn't
>>  mentioned in the release notes for 20.02 so I'm not sure what happened
>>  there, the only clue is this commit:
>>
>>  https://github.com/SchedMD/slurm/commit/7461b6ba95bb8ae70b36425f2c7e4961ac35799e#diff-cac030b65a8fc86123176971a94062fafb262cb2b11b3e90d6cc69e353e3bb89
>>
>>  which says "xcpuinfo_abs_to_mac() expects a core list, not a CPU list."
>>
>>  Best of luck!
>>  Chris
>>  --
>>   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA