[slurm-users] Defining new Gres types on nodes
wdennis at nec-labs.com
Mon Sep 24 10:25:05 MDT 2018
We want to add some Gres resource types pertaining to GPUs (amount of GPU memory and number of CUDA cores) on some of our nodes, so we added the following parameters to the 'gres.conf' on the nodes that have GPUs:
And in slurm.conf:
And down in the NodeName lines for these servers:
(where <#> of course is the relevant numerical value)
However, upon restarting the slurmctld on the controller, and the slurmd on the clients, the nodes appear to be unhappy with this, giving a message such as:
Reason=gres/gpu_mem count too low (0 < 4294967296) [root at 2018-09-24T11:36:01]
And, of course, the nodes then go into the DRAIN state.
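As a quick sanity check on the numbers in that message, here is a minimal Python sketch that pulls the two counts out of the Reason string and converts the larger one to GiB. The labels `found` and `configured` are assumptions about what the two numbers mean (the count the node reported versus the count the controller expected); they are one reading of the message text, not something confirmed here. Treating the value as bytes is likewise an assumption.

```python
import re

# Reason string copied from the node state shown above.
reason = "gres/gpu_mem count too low (0 < 4294967296)"

# Assumed layout: "gres/<name> count too low (<found> < <configured>)".
# The names "found" and "configured" are illustrative labels only.
pattern = re.compile(r"gres/(\w+) count too low \((\d+) < (\d+)\)")


def parse_reason(text):
    """Return (gres_name, found_count, configured_count) from a Reason string."""
    match = pattern.search(text)
    if match is None:
        raise ValueError("unrecognized Reason string: " + text)
    return match.group(1), int(match.group(2)), int(match.group(3))


name, found, configured = parse_reason(reason)

# 4294967296 is exactly 4 * 1024**3, i.e. 4 GiB if the unit is bytes.
gib = configured / 2**30
print(f"{name}: found={found}, configured={configured} ({gib:g} GiB if bytes)")
```

Under that reading, the larger number is exactly 4 * 1024^3, so the expected gpu_mem count looks like a "4G"-style value, while the count seen for the node is 0.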
We are running Slurm v16.04.5. Is something like the above possible on this version? If so, what could be the problem?