[slurm-users] Sharing a GPU

Bas van der Vlies bas.vandervlies at surf.nl
Wed Apr 13 12:14:51 UTC 2022


I just released a new version of the plugin. Our cluster has been upgraded to 21.08.6, where the cgroups structure is different. This is fixed in the latest release:
 * Tested on 21.08 and 20.11

Regards

> On 4 Apr 2022, at 09:20, Bas van der Vlies <bas.vandervlies at surf.nl> wrote:
> 
> We have the exact same request for our GPUs that are not A100s, and we have developed a Lua plugin for our needs (the new Slurm version, 22.XX, will also allow this). But for earlier versions:
> * https://github.com/basvandervlies/surf_slurm_mps
> 
> 
> 
> On 03/04/2022 23:19, Kamil Wilczek wrote:
>> Hello!
>> I am an administrator of a GPU cluster (Slurm version 19.05.5).
>> Could someone help me a little bit and explain if a single
>> GPU can be shared between multiple users? My experience and
>> the documentation tell me that it is not possible. But even after
>> some time Slurm is still a beast to me and I find myself
>> struggling :)
>> * I set up the cluster to assign GPUs on multi-GPU servers
>>   to different users using GRES. This works fine and several
>>   users can work on a multi-GPU machine (--gres=gpu:N/--gpus=N).
>> * But sometimes I have requests to allow a group of students
>>   to work simultaneously, interactively on a small partition,
>>   where there are more users than GPUs. So I thought that maybe
>>   MPS is a solution, but the docs say that MPS is a way
>>   to run multiple jobs of *the same* user on a single GPU.
>>   When another user requests a GPU via MPS, the job is enqueued
>>   and waits for the first user's MPS server to finish.
>>   So, this is not a solution for a multi-user, simultaneous/parallel
>>   environment, right?
>> Is there a way to share a GPU between multiple users?
>> The requirement is, say:
>> * 16 users working interactively, simultaneously
>> * a 4-GPU partition
>> Kind Regards
> 
> -- 
> Bas van der Vlies
> | HPCV Supercomputing | Internal Services  | SURF | https://userinfo.surfsara.nl |
> | Science Park 140 | 1098 XG Amsterdam | Phone: +31208001300 |
> |  bas.vandervlies at surf.nl



