[slurm-users] Sharing a GPU

Kamil Wilczek kmwil at mimuw.edu.pl
Tue Apr 5 10:01:17 UTC 2022

Thank you all for the help!
The plugin seems to be the thing I'm looking for.
I'll try to test it with a spare server/GPUs.

Thanks again!
Kamil Wilczek

On 04.04.2022 at 09:20, Bas van der Vlies wrote:
> We have the exact same request for our GPUs that are not A100s, and we
> have developed a Lua plugin for our needs (the new Slurm version, 22.XX,
> will also allow this). But for earlier versions:
>   * https://github.com/basvandervlies/surf_slurm_mps
> On 03/04/2022 23:19, Kamil Wilczek wrote:
>> Hello!
>> I am an administrator of a GPU cluster (Slurm version 19.05.5).
>> Could someone help me a little bit and explain if a single
>> GPU can be shared between multiple users? My experience and
>> the documentation tell me that it is not possible. But even after
>> some time Slurm is still a beast to me and I find myself
>> struggling :)
>> * I set up the cluster to assign GPUs on multi-GPU servers
>>    to different users using GRES. This works fine and several
>>    users can work on a multi-GPU machine (--gres=gpu:N/--gpus=N).
>> * But sometimes I have requests to allow a group of students
>>    to work simultaneously, interactively on a small partition,
>>    where there are more users than GPUs. So I thought that maybe
>>    MPS is a solution, but the docs say that MPS is a way
>>    to run multiple jobs of *the same* user on a single GPU.
>>    When another user requests a GPU via MPS, the job is enqueued
>>    and waits for the first user's MPS server to finish.
>>    So, this is not a solution for a multi-user, simultaneous/parallel
>>    environment, right?
>> Is there a way to share a GPU between multiple users?
>> The requirement is, say:
>> * 16 users working interactively, simultaneously
>> * 4 GPUs partition
>> Kind Regards

Kamil Wilczek  [https://keys.openpgp.org/]
Laboratorium Komputerowe
Wydział Matematyki, Informatyki i Mechaniki
Uniwersytet Warszawski

ul. Banacha 2
02-097 Warszawa

Tel.: 22 55 44 392
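
For reference, the requirement in the quoted message (16 interactive users on a 4-GPU partition) works out to four users per GPU. The sketch below is only that arithmetic, assuming each GPU is configured with 100 gres/mps units (for example, a node-wide MPS count of 400 spread over 4 GPUs); the helper name mps_share_per_user and the 100-units-per-GPU figure are illustrative assumptions, not something stated in the thread. It does not lift the limitation described above, where a second user's MPS job waits for the first user's MPS server to finish.

```python
import math


def mps_share_per_user(users: int, gpus: int, mps_per_gpu: int = 100) -> int:
    """Return the gres/mps count each user could request.

    Assumes users are spread as evenly as possible across GPUs and that
    every GPU exposes `mps_per_gpu` units (an assumed configuration).
    """
    if users <= 0 or gpus <= 0 or mps_per_gpu <= 0:
        raise ValueError("users, gpus and mps_per_gpu must be positive")
    # Worst case: the most heavily loaded GPU hosts ceil(users / gpus) users.
    users_per_gpu = math.ceil(users / gpus)
    # Integer division so the shares never add up to more than one GPU.
    return mps_per_gpu // users_per_gpu


if __name__ == "__main__":
    share = mps_share_per_user(users=16, gpus=4)
    # Build the kind of interactive request each student might submit.
    print(f"srun --gres=mps:{share} --pty bash")
```

With 16 users and 4 GPUs each user would request a quarter of a GPU; whether those requests can actually run side by side for different users depends on the Slurm version and on a site-specific solution such as the plugin mentioned in the reply.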