[slurm-users] Heterogeneous GPU Node MPS

Holger Badorreck h.badorreck at lzh.de
Mon Nov 16 12:22:58 UTC 2020


I read that too; however, the RTX cards are not really pre-Volta, and when I
run the MPS server, nvidia-smi gives me e.g.:
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     20294      C   nvidia-cuda-mps-server                        25MiB |
|    1     20294      C   nvidia-cuda-mps-server                        25MiB |
|    2     20294      C   nvidia-cuda-mps-server                        29MiB |
|    2     20329    M+C   /opt/lammps/lammps-20201029-cuda/bin/lmp    6087MiB |
|    2     20393    M+C   /opt/lammps/lammps-20201029-cuda/bin/lmp    6087MiB |
+-----------------------------------------------------------------------------+
So the MPS server is running on every GPU, while Slurm schedules jobs requested
with --gres=mps:100 only onto the V100. Note that the GPU numbering of
nvidia-smi and gres.conf is inverted here, so GPU ID 2 in the output above is
the V100. The ordering in gres.conf has no effect on this; commenting out the
V100 lines in gres.conf and removing the V100 from the node configuration does
not change anything either: with --gres=mps the V100 is still used.
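As a side note (not from the original report), one way to cross-check which
/dev/nvidiaN minor the V100 actually is, and how that relates to the nvidia-smi
index, is a sketch along these lines (assuming a standard driver setup):

# List the nvidia-smi enumeration with PCI bus IDs and UUIDs
nvidia-smi --query-gpu=index,name,pci.bus_id,uuid --format=csv
# The device minors referenced by gres.conf's File= entries
ls -l /dev/nvidia[0-9]*
# CUDA applications can be forced to enumerate in PCI bus order
# (instead of the default fastest-first order) with:
export CUDA_DEVICE_ORDER=PCI_BUS_ID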

After excluding the V100 from the MPS server by setting
CUDA_VISIBLE_DEVICES=1,2 (!), one job runs on an RTX card, but a second job
stays pending because no free resources are found.
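For context, a minimal sketch of how the MPS control daemon can be restricted
to the two RTX cards before it is started; the pipe/log directories are just
example paths, and whether the daemon is launched manually or from a Slurm
prolog depends on the site setup:

# Hide the V100 from the MPS control daemon, then start it.
# Note: CUDA_VISIBLE_DEVICES uses the CUDA enumeration, which may differ
# from both the nvidia-smi indices and the /dev/nvidiaN minors (see above).
export CUDA_VISIBLE_DEVICES=1,2
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps       # example path
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-mps-log    # example path
nvidia-cuda-mps-control -d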


> From the NVIDIA docs re: MPS:
> On systems with a mix of Volta / pre-Volta GPUs, if the MPS server is set
> to enumerate any Volta GPU, it will discard all pre-Volta GPUs. In other
> words, the MPS server will either operate only on the Volta GPUs and expose
> Volta capabilities, or operate only on pre-Volta GPUs.
>
> I'd be curious what happens if you change the ordering (RTX then V100) in
> the gres.conf -- would the RTX work with MPS and the V100 would not?
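(For reference, the reordering suggested above would presumably look like the
sketch below, assuming the device files keep their current mapping:)

Name=gpu Type=rtx  File=/dev/nvidia1
Name=gpu Type=rtx  File=/dev/nvidia2
Name=gpu Type=v100 File=/dev/nvidia0
Name=mps Count=200 File=/dev/nvidia1
Name=mps Count=200 File=/dev/nvidia2
Name=mps Count=200 File=/dev/nvidia0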


> > On Nov 13, 2020, at 07:23 , Holger Badorreck <h.badorreck at lzh.de> wrote:
> > 
> > Hello,
> > 
> > I have a heterogeneous GPU Node with one V100 and two RTX cards. When I
> > request resources with --gres=mps:100, always the V100 is chosen, and jobs
> > are waiting if the V100 is completely allocated, while RTX cards are free.
> > If I use --gres=gpu:1, also the RTX cards are used. Is something wrong with
> > the configuration or is it another problem?
> >  
> > The node configuration in slurm.conf:
> > NodeName=node1 CPUs=48 RealMemory=128530 Sockets=1 CoresPerSocket=24 ThreadsPerCore=2 Gres=gpu:v100:1,gpu:rtx:2,mps:600 State=UNKNOWN
> >  
> > gres.conf:
> > Name=gpu Type=v100 File=/dev/nvidia0
> > Name=gpu Type=rtx  File=/dev/nvidia1
> > Name=gpu Type=rtx  File=/dev/nvidia2
> > Name=mps Count=200 File=/dev/nvidia0
> > Name=mps Count=200 File=/dev/nvidia1
> > Name=mps Count=200 File=/dev/nvidia2
> >  
> > Best regards,
> > Holger



