[slurm-users] addressing NVIDIA MIG + non MIG devices in Slurm

Matthias Leopold matthias.leopold at meduniwien.ac.at
Thu Jan 27 15:27:21 UTC 2022


Hi,

We have two DGX A100 systems that we would like to use with Slurm, with 
the MIG feature enabled for _some_ of the GPUs only. As I half expected, 
I haven't found a working Slurm setup for this yet. Below I describe the 
configuration variants I tried after creating the MIG instances. It is a 
longer read, so please bear with me.

1. using slurm-mig-discovery for gres.conf 
(https://gitlab.com/nvidia/hpc/slurm-mig-discovery)
- CUDA_VISIBLE_DEVICES: list of indices
-> looks like a working setup with full flexibility at first, but on 
closer inspection the selection of GPU devices is completely 
unpredictable (judging by the output of nvidia-smi inside a Slurm job)
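
To see which form CUDA_VISIBLE_DEVICES takes inside a job (plain 
indices as in variant 1, or UUID-style entries), a minimal sketch like 
the following can be run from the job script. The function name 
classify_visible_devices is hypothetical and not part of Slurm or CUDA; 
the classification relies only on the "MIG-" and "GPU-" prefixes 
described in the CUDA environment variable documentation.

```python
import os
import re

# Hypothetical helper for illustration; not part of Slurm or CUDA.
_INDEX = re.compile(r"^\d+$")


def classify_visible_devices(value):
    """Return (entry, kind) pairs for a CUDA_VISIBLE_DEVICES string.

    kind is 'index', 'mig', 'gpu-uuid' or 'unknown'.
    """
    result = []
    for entry in (e.strip() for e in value.split(",")):
        if not entry:
            continue  # skip empty fields, e.g. when the variable is unset
        if _INDEX.match(entry):
            kind = "index"
        elif entry.startswith("MIG-"):
            kind = "mig"
        elif entry.startswith("GPU-"):
            kind = "gpu-uuid"
        else:
            kind = "unknown"
        result.append((entry, kind))
    return result


if __name__ == "__main__":
    # Inside a Slurm job this prints what the job was actually given.
    print(classify_visible_devices(os.environ.get("CUDA_VISIBLE_DEVICES", "")))
```

Comparing this output with nvidia-smi from the same job shows whether 
the job was handed indices or MIG identifiers.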

2. using "AutoDetect=nvml" in gres.conf (Slurm docs)
- CUDA_VISIBLE_DEVICES: MIG format (see 
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars)

2.1 converting ALL GPUs to MIG
- even a full A100 is converted to a 7g.40gb MIG instance
- gres.conf: "AutoDetect=nvml" only
- slurm.conf Node Def: naming all MIG types (read from slurmd debug log)
-> working setup
-> problem: IPC (MPI) between MIG instances is not possible; this seems 
to be a limitation by design

2.2 converting SOME GPUs to MIG
- some A100s are NOT in MIG mode

2.2.1 using "AutoDetect=nvml" only (Variant 1)
- slurm.conf Node Def: Gres with and without type
-> problem: fatal: _foreach_slurm_conf: Some gpu GRES in slurm.conf have 
a type while others do not (slurm_gres->gres_cnt_config (26) > tmp_count 
(21))
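
The two numbers in that message can be reproduced with a small sketch. 
The Gres string "gpu:1g.5gb:21,gpu:5" is an assumption chosen to match 
the counts in the error (26 configured, 21 typed); the actual node 
definition is not shown here. The parser below is a simplification that 
handles only name:count and name:type:count entries and is not Slurm's 
own parser; parse_gres and summarize are hypothetical names.

```python
def parse_gres(gres):
    """Split a Gres string into (name, type, count) tuples.

    Simplified: accepts only 'name:count' and 'name:type:count'.
    """
    entries = []
    for item in gres.split(","):
        parts = item.strip().split(":")
        if len(parts) == 2:
            name, gtype, count = parts[0], None, int(parts[1])
        elif len(parts) == 3:
            name, gtype, count = parts[0], parts[1], int(parts[2])
        else:
            raise ValueError("unsupported Gres entry: %r" % item)
        entries.append((name, gtype, count))
    return entries


def summarize(gres, name="gpu"):
    """Return (total, typed, mixed) for one GRES name.

    mixed is True when typed and untyped entries are combined.
    """
    entries = [e for e in parse_gres(gres) if e[0] == name]
    total = sum(count for _, _, count in entries)
    typed = sum(count for _, gtype, count in entries if gtype is not None)
    mixed = 0 < typed < total
    return total, typed, mixed


if __name__ == "__main__":
    # Assumed node definition: 21 typed MIG instances plus 5 untyped GPUs.
    print(summarize("gpu:1g.5gb:21,gpu:5"))  # (26, 21, True)
```

With that assumed string the total is 26 and the typed share is 21, 
the same pair of numbers the fatal error reports.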

2.2.2 using "AutoDetect=nvml" only (Variant 2)
- slurm.conf Node Def: only Gres without type (sum of MIG + non MIG)
-> problem: different GPU types can't be requested

2.2.3 using partial "AutoDetect=nvml"
- gres.conf: "AutoDetect=nvml" + hardcoding of non MIG GPUs
- slurm.conf Node Def: MIG + non MIG Gres types
-> produces a "perfect" config according to slurmd debug log
-> problem: the sanity-check mode of "AutoDetect=nvml" prevents 
operation (?)
-> Reason=gres/gpu:1g.5gb count too low (0 < 21) [slurm at 2022-01-27T11:23:59]
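
For spotting this state across nodes, the Reason text can be parsed 
with a short sketch like the one below. It assumes the message always 
has the shape shown above, and it reads the first number as the count 
found on the node and the second as the count configured in slurm.conf; 
that reading is an interpretation of the message, not something taken 
from the Slurm documentation. parse_count_too_low is a hypothetical 
name.

```python
import re

# Pattern derived from the single Reason line quoted above (an assumption).
_REASON = re.compile(
    r"gres/(?P<name>[^:\s]+):(?P<type>\S+) count too low "
    r"\((?P<found>\d+) < (?P<configured>\d+)\)"
)


def parse_count_too_low(reason):
    """Extract GRES name, type and both counts, or None if no match."""
    match = _REASON.search(reason)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "type": match.group("type"),
        "found": int(match.group("found")),
        "configured": int(match.group("configured")),
    }


if __name__ == "__main__":
    line = "Reason=gres/gpu:1g.5gb count too low (0 < 21)"
    print(parse_count_too_low(line))
```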

2.2.4 using static gres.conf with NVML generated config
- a static gres.conf based on the NVML generated config, where I can 
define the type for the non MIG GPUs and also set the UniqueId for the 
MIG instances, would be the perfect solution
- slurm.conf Node Def: MIG + non MIG Gres types
-> problem: it doesn't work
-> Parsing error at unrecognized key: UniqueId

Thanks for reading this far. Am I missing something? How can MIG and non 
MIG devices be addressed in a cluster? This setup of having MIG and non 
MIG devices can't be exotic, since having ONLY MIG devices has severe 
disadvantages (see 2.1). Thanks again for any advice.

Matthias


