[slurm-users] Using sharding

4 Jul 2024


Greetings,
There are not many questions regarding GPU sharding here, and I am unsure
if I am using it correctly... I have configured it according to the
instructions at https://slurm.schedmd.com/gres.html, and it seems to be
configured properly:
$ scontrol show node compute01
NodeName=compute01 Arch=x86_64 CoresPerSocket=32
   CPUAlloc=48 CPUEfctv=128 CPUTot=128 CPULoad=10.95
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=gpu:8,shard:32
   [truncated]
When running with gres:gpu everything works perfectly:
$ /usr/bin/srun --gres=gpu:2 ls
srun: job 192 queued and waiting for resources
srun: job 192 has been allocated resources
(...)
However, when using sharding, it just stays waiting indefinitely:
$ /usr/bin/srun --gres=shard:2 ls
srun: job 193 queued and waiting for resources
The reason it gives for pending is just "Resources":
$ scontrol show job 193
JobId=193 JobName=ls
   UserId=rpcruz(1000) GroupId=rpcruz(1000) MCS_label=N/A
   Priority=1 Nice=0 Account=account QOS=normal
   JobState=PENDING Reason=Resources Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=2-00:00:00 TimeMin=N/A
   SubmitTime=2024-06-28T05:36:51 EligibleTime=2024-06-28T05:36:51
   AccrueTime=2024-06-28T05:36:51
   StartTime=2024-06-29T18:13:22 EndTime=2024-07-01T18:13:22 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2024-06-28T05:37:20 Scheduler=Backfill:*
   Partition=partition AllocNode:Sid=localhost:47757
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   ReqTRES=cpu=1,mem=1031887M,node=1,billing=1
   AllocTRES=(null)
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=ls
   WorkDir=/home/rpcruz
   Power=
   TresPerNode=gres/shard:2
Again, I think I have configured it properly - it shows up correctly in
scontrol (as shown above).
Our setup is pretty simple - I just added shard to /etc/slurm/slurm.conf:
GresTypes=gpu,shard
NodeName=compute01 Gres=gpu:8,shard:32 [truncated]
Our /etc/slurm/gres.conf is also straightforward (it works fine for
--gres=gpu:1):
Name=gpu File=/dev/nvidia[0-7]
Name=shard Count=32
Maybe I am just running srun improperly? Shouldn't it just be
srun --gres=shard:2 to allocate half of a GPU? (I am using 32 shards for
the 8 GPUs, so that is 4 shards per GPU.)
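To spell out the arithmetic behind that expectation, here is a minimal shell sketch. The helper names shards_per_gpu and gpu_percent are made up for illustration and are not Slurm commands. It assumes the 32 shards are spread evenly over the 8 GPUs.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): shards available on each GPU,
# assuming the total shard count is split evenly across all GPUs.
shards_per_gpu() {
    total_shards=$1
    gpu_count=$2
    echo $(( total_shards / gpu_count ))
}

# Hypothetical helper: percentage of a single GPU that a shard request
# corresponds to, given the number of shards configured per GPU.
gpu_percent() {
    requested=$1
    per_gpu=$2
    echo $(( 100 * requested / per_gpu ))
}

# Values taken from the configuration above: Gres=gpu:8,shard:32
per_gpu=$(shards_per_gpu 32 8)
echo "shards per GPU: $per_gpu"
echo "--gres=shard:2 is $(gpu_percent 2 "$per_gpu")% of one GPU"
```

With these numbers, a request for 2 shards should correspond to half of one GPU, which is what I am trying to get.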
Thank you very much for your attention,
--
Ricardo Cruz - https://rpmcruz.github.io
