Hello,
I would like to know whether it is possible to use "sacctmgr" to limit the use of a certain type of GPU, according to the name I have assigned in the "gres.conf" file. For example, my small cluster has 3 GPU nodes with 2 GPUs each. Two of those GPUs are the same model, but they are located in different servers. Because of my scenario, I would like to limit each user to only one GPU of that type, so that nobody can use both of them.
For example:
* gpu-node-1:
  * GTX1080
  * RTX3080
* gpu-node-2:
  * GTX750
  * GTX680
* gpu-node-3:
  * RTX2070
  * RTX3080
What I want is that users can use all of the GPUs, but that a user can use only one of the RTX3080s at a time.
Using QoS, I created a new QoS with "sacctmgr add qos test-gpu-limit MaxTRESPerUser=gres/gpu=1", but with this QoS users are limited to only one GPU in total, even when they need to use different GPU models. I have also tried "sacctmgr add qos test-gpu-limit MaxTRESPerUser=gres/gpu:RTX3080:1", but the system returns this error: "sacctmgr: error: slurmdb_format_tres_str: no TRES id found for gres/gpu:RTX3080:1".
So, is it possible to apply the limits I want?
Thanks.
--
Daniel Ruiz Molina
Tècnic Mitjà Informàtic
Arquitectura de Computadors i Sistemes Operatius
Escola d'Enginyeria
Edifici Q - Despatx QC/3052 - Carrer de les Sitges
Campus de la UAB · 08193 Bellaterra (Cerdanyola del Vallès) · Barcelona · Spain
+34 93 581 35 44
http://www.uab.cat/
Daniel Ruiz at UAB: https://tinyurl.com/yd95zb8j
Gestió Servidors via slurm-users wrote:
> What I want is that users can use all of the GPUs, but that a user can use only one of the RTX3080s at a time.
How about two partitions: one that contains only the RTX3080s and uses a QoS with MaxTRESPerUser=gres/gpu=1, and another that contains all the other GPUs and does not have this QoS? Users then submit their jobs to both of these partitions.
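
A rough sketch of that setup, not a tested configuration. The partition names gpu-rtx3080 and gpu-other, the QoS name rtx3080-limit and the script job.sh are made-up placeholders. The partition definitions appear only as comments, because the node lists depend on how the GPUs are divided between the two partitions (gpu-node-1 and gpu-node-3 each hold an RTX3080 next to another model). The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the two-partition idea. All names below are hypothetical.
#
# Assumed slurm.conf entries (node lists left open on purpose):
#   PartitionName=gpu-rtx3080 Nodes=<nodes> QOS=rtx3080-limit
#   PartitionName=gpu-other   Nodes=<nodes>
# QoS limits are only enforced if AccountingStorageEnforce includes "limits".

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = really run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# QoS that caps every user at one GPU. It is meant to be attached
# only to the RTX3080 partition, so other GPUs stay unrestricted.
run sacctmgr -i add qos rtx3080-limit MaxTRESPerUser=gres/gpu=1

# Users submit to both partitions; the job starts in whichever
# partition can run it first.
run sbatch --partition=gpu-rtx3080,gpu-other --gres=gpu:1 job.sh
```

With the default DRY_RUN=1 nothing is changed on the cluster, so the commands can be reviewed before they are run for real.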