[slurm-users] Job not running with Resource Reason even though resources appear to be available

Sun Jan 24 16:39:56 UTC 2021

Thanks Chris.

I think you have identified the issue here or are very close.  My gres.conf on
the rtx-04 node for example is:

AutoDetect=nvml
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia0 Cores=0-15
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia1 Cores=0-15
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia2 Cores=0-15
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia3 Cores=0-15
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia4 Cores=0-15
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia5 Cores=0-15
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia6 Cores=0-15
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia7 Cores=0-15
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia8 Cores=0-15
Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia9 Cores=0-15

There are 32 cores (HT is off), but the daughter card that holds all
10 of the RTX8000s connects to only one socket, as can be seen from
'nvidia-smi topo -m'.
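
To make the effect of that gres.conf concrete, here is a rough Python
sketch (a toy model only, not Slurm's actual scheduling logic). It assumes
the 32 cores are numbered contiguously per socket, i.e. cores 0-15 on
socket 0 and 16-31 on socket 1; that numbering and the helper names are
my assumptions for illustration, so check them against the real topology.

```python
# Toy model: which sockets do the GPU "Cores=" lists cover, and which GPUs
# still have a free local core once some cores are busy?
# ASSUMPTION: cores 0-15 sit on socket 0 and 16-31 on socket 1.
CORES_PER_SOCKET = 16

# The ten gres.conf GPU lines quoted above, generated to keep this short.
GRES_LINES = [
    f"Name=gpu Type=quadro_rtx_8000 File=/dev/nvidia{i} Cores=0-15"
    for i in range(10)
]


def expand_cores(spec):
    """Expand a core list such as '0-3,8' into a set of core indices."""
    cores = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        cores.update(range(int(lo), int(hi or lo) + 1))
    return cores


def gpu_local_cores(lines):
    """Map each GPU device file to the cores its gres.conf line names."""
    local = {}
    for line in lines:
        fields = dict(kv.split("=", 1) for kv in line.split() if "=" in kv)
        if fields.get("Name") == "gpu" and "Cores" in fields:
            local[fields["File"]] = expand_cores(fields["Cores"])
    return local


def sockets_of(cores):
    """Sockets touched by a set of cores, under the numbering assumption."""
    return {core // CORES_PER_SOCKET for core in cores}


def gpus_with_free_local_core(local, busy):
    """GPUs that still have at least one unallocated local core."""
    return sorted(dev for dev, cores in local.items() if cores - busy)


local = gpu_local_cores(GRES_LINES)
all_local = set().union(*local.values())
print("sockets with GPU-local cores:", sockets_of(all_local))

# Once every core on socket 0 is allocated, no GPU has a local core left,
# even though the 16 cores on socket 1 are all idle.
busy = set(range(16))
print("GPUs with a free local core:", gpus_with_free_local_core(local, busy))
```

Under those assumptions every GPU is tied to socket 0 only, which would
fit jobs pending on Resources while half the cores sit idle.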

It's odd, though, in that my tests on my identically configured
rtx6000 partition did not show that behavior, but maybe that is
due to the "random" cores that got assigned to jobs there
all having at least one core on the "right" socket.

Anyway, how do I turn off this "affinity enforcement"? It is
more important that a job run with a GPU on its non-affinity socket
than just wait and not run at all.

Thanks

-- Paul Raines (http://help.nmr.mgh.harvard.edu)

On Sat, 23 Jan 2021 3:19pm, Chris Samuel wrote:

> On Saturday, 23 January 2021 9:54:11 AM PST Paul Raines wrote:
>
>> Now rtx-08 which has only 4 GPUs seems to always get all 4 used.
>> But the others seem to always only get half used (except rtx-07
>> which somehow gets 6 used so another weird thing).
>>
>> Again if I submit non-GPU jobs, they end up allocating all the
>> cores/cpus on the nodes just fine.
>
> What does your gres.conf look like for these nodes?
>
> One thing I've seen in the past is where the core specifications for the GPUs
> are out of step with the hardware and so Slurm thinks they're on the wrong
> socket.  Then when all the cores in that socket are used up Slurm won't put
> more GPU jobs on the node without the jobs explicitly asking to not do
> locality.
>
> One thing I've noticed is that prior to Slurm 20.02 the documentation for
> gres.conf used to say:
>
> # If your cores contain multiple threads only the first thread
> # (processing unit) of each core needs to be listed.
>
> but that language is gone from 20.02 and later and the change isn't mentioned
> in the release notes for 20.02, so I'm not sure what happened there. The only
> clue is this commit:
>
> https://github.com/SchedMD/slurm/commit/7461b6ba95bb8ae70b36425f2c7e4961ac35799e#diff-cac030b65a8fc86123176971a94062fafb262cb2b11b3e90d6cc69e353e3bb89
>
> which says "xcpuinfo_abs_to_mac() expects a core list, not a CPU list."
>
> Best of luck!
> Chris
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA