Hello all,
Apologies for the basic question, but is there a straightforward, best-accepted method for using Slurm to report on which GPUs are currently in use? I've done some searching and people recommend all sorts of methods, including parsing the output of nvidia-smi (seems inefficient, especially across multiple GPU nodes), as well as using other tools such as Grafana, XDMoD, etc.
We do track GPUs as a resource, so I'd expect I could get at the info with sreport or something like that, but before trying to craft my own from scratch, I'm hoping someone has something working already. Ultimately I'd like to see either which cards are available by node, or the reverse (which are in use by node). I know recent versions of Slurm supposedly added tighter integration in some way with NVIDIA cards, but I can't seem to find definitive docs on what, exactly, changed or what is now possible as a result.
Warmest regards, Jason
If you do scontrol -d show node it will show in more detail which resources are actually being used:
[root@holy8a24507 general]# scontrol show node holygpu8a11101
NodeName=holygpu8a11101 Arch=x86_64 CoresPerSocket=48
   CPUAlloc=70 CPUEfctv=96 CPUTot=96 CPULoad=173.07
   AvailableFeatures=amd,holyndr,genoa,avx,avx2,avx512,gpu,h100,cc9.0
   ActiveFeatures=amd,holyndr,genoa,avx,avx2,avx512,gpu,h100,cc9.0
   Gres=gpu:nvidia_h100_80gb_hbm3:4(S:0-15)
   NodeAddr=holygpu8a11101 NodeHostName=holygpu8a11101 Version=24.11.2
   OS=Linux 4.18.0-513.18.1.el8_9.x86_64 #1 SMP Wed Feb 21 21:34:36 UTC 2024
   RealMemory=1547208 AllocMem=896000 FreeMem=330095 Sockets=2 Boards=1
   MemSpecLimit=16384
   State=MIXED ThreadsPerCore=1 TmpDisk=863490 Weight=1442 Owner=N/A MCS_label=N/A
   Partitions=kempner_requeue,kempner_dev,kempner_h100,kempner_h100_priority,gpu_requeue,serial_requeue
   BootTime=2024-10-23T13:10:56 SlurmdStartTime=2025-03-24T14:51:01
   LastBusyTime=2025-03-30T15:55:51 ResumeAfterTime=None
   CfgTRES=cpu=96,mem=1547208M,billing=2302,gres/gpu=4,gres/gpu:nvidia_h100_80gb_hbm3=4
   AllocTRES=cpu=70,mem=875G,gres/gpu=4,gres/gpu:nvidia_h100_80gb_hbm3=4
   CurrentWatts=0 AveWatts=0
[root@holy8a24507 general]# scontrol -d show node holygpu8a11101
NodeName=holygpu8a11101 Arch=x86_64 CoresPerSocket=48
   CPUAlloc=70 CPUEfctv=96 CPUTot=96 CPULoad=173.07
   AvailableFeatures=amd,holyndr,genoa,avx,avx2,avx512,gpu,h100,cc9.0
   ActiveFeatures=amd,holyndr,genoa,avx,avx2,avx512,gpu,h100,cc9.0
   Gres=gpu:nvidia_h100_80gb_hbm3:4(S:0-15)
   GresDrain=N/A
   GresUsed=gpu:nvidia_h100_80gb_hbm3:4(IDX:0-3)
   NodeAddr=holygpu8a11101 NodeHostName=holygpu8a11101 Version=24.11.2
   OS=Linux 4.18.0-513.18.1.el8_9.x86_64 #1 SMP Wed Feb 21 21:34:36 UTC 2024
   RealMemory=1547208 AllocMem=896000 FreeMem=330095 Sockets=2 Boards=1
   MemSpecLimit=16384
   State=MIXED ThreadsPerCore=1 TmpDisk=863490 Weight=1442 Owner=N/A MCS_label=N/A
   Partitions=kempner_requeue,kempner_dev,kempner_h100,kempner_h100_priority,gpu_requeue,serial_requeue
   BootTime=2024-10-23T13:10:56 SlurmdStartTime=2025-03-24T14:51:01
   LastBusyTime=2025-03-30T15:55:51 ResumeAfterTime=None
   CfgTRES=cpu=96,mem=1547208M,billing=2302,gres/gpu=4,gres/gpu:nvidia_h100_80gb_hbm3=4
   AllocTRES=cpu=70,mem=875G,gres/gpu=4,gres/gpu:nvidia_h100_80gb_hbm3=4
   CurrentWatts=0 AveWatts=0
Now it won't give you the individual performance of the GPUs; Slurm doesn't currently track that in a convenient way like it does CPU load. It will at least give you what has been allocated on each node. We take the non-detailed dump (which tells you how many GPUs are allocated, but not which ones) and throw it into Grafana via Prometheus to get general cluster stats: https://github.com/fasrc/prometheus-slurm-exporter
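If you just want a quick allocated-vs-configured summary across all of your GPU nodes at once, something along these lines should work on recent Slurm releases (GresUsed is an sinfo output field on newer versions, and "gpu" here is just a placeholder partition name):

$ sinfo -N -p gpu -O "NodeHost:20,Gres:40,GresUsed:40"

or simply grep the detailed node dump:

$ scontrol -d show node | grep -E "NodeName=|GresUsed="

The (IDX:...) portion of GresUsed lists which physical GPU indices are currently allocated on each node.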
If you are looking for performance stats, NVIDIA has a DCGM exporter that we use to pull them and dump them into Grafana: https://github.com/NVIDIA/dcgm-exporter
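As a rough sketch (check the dcgm-exporter README for the current container image and tag), the exporter can also be run standalone and scraped over HTTP:

$ docker run -d --rm --gpus all -p 9400:9400 nvcr.io/nvidia/k8s/dcgm-exporter:<tag>
$ curl localhost:9400/metrics | grep DCGM_FI_DEV_GPU_UTIL

which gives per-GPU utilization, memory, power, etc. for Prometheus to scrape.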
On a per-job basis I know people use Weights & Biases, but that is code-specific: https://wandb.ai/site/
You can also use scontrol -d show job to print out the layout of a job, including which specific GPUs were assigned.
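For example (the job ID below is just a placeholder), something like:

$ scontrol -d show job 12345 | grep -i gres

should show a JOB_GRES= line plus, for each node in the allocation, a GRES=...(IDX:...) entry with the specific GPU indices the job was given.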
-Paul Edmon-
Hi Jason,
We use the Slurm tool "pestat" (Processor Element status) available from [1] for all kinds of cluster monitoring, including GPU usage. An example usage is:
$ pestat -G -p a100
GPU GRES (Generic Resource) is printed after each JobID
Print only nodes in partition a100
Hostname       Partition     Node Num_CPU  CPUload  Memsize  Freemem  GRES/node    Joblist
                            State Use/Tot  (15min)     (MB)     (MB)               JobID(JobArrayID) User GRES/job ...
sd651            a100+        mix  38  128    4.01*   512000   407942 gpu:A100:4   8467943 user1 gpu:a100=1  8480327 user1 gpu:a100=1  8480325 user1 gpu:a100=1  8488029 user2 gpu:a100=1
sd652            a100+        mix  98  128    4.00*   512000   275860 gpu:A100:4   8467942 user1 gpu:a100=1  8488442 user2 gpu:a100=1  8489252 user2 gpu:a100=1  8489253 user2 gpu:a100=1
sd653            a100         mix   8  128    4.00    512000   487001 gpu:A100:4   8480330 user1 gpu:a100=1  8480329 user1 gpu:a100=1  8480328 user1 gpu:a100=1  8480326 user1 gpu:a100=1
sd654            a100         mix  38  128    4.05*   512000   365431 gpu:A100:4   8496110 user3 gpu:a100=1  8480331 user1 gpu:a100=1  8480332 user1 gpu:a100=1  8480333 user1 gpu:a100=1
If you want to find out the GPU usage of a specific job, the "psjob" command from [2] is really handy. An example output is:
$ psjob 8496110
JOBID    PARTITION NODES TASKS USER  START_TIME          TIME     TIME_LIMIT TRES_ALLOC
8496110  a100      1     32    user3 2025-04-02T03:43:49 12:22:40 2-00:00:00 cpu=32,mem=112000M,node=1,billing=128,gres/gpu=1,gres/gpu:a100=1
NODELIST: sd654
====================================================
Process list from 'ps' on each node in the job:
--------------- sd654 ---------------
    PID NLWP S USER    STARTED     TIME %CPU     RSS COMMAND
 603039    1 S user3  03:43:49 00:00:00  0.0    4304 /bin/bash /var/spool/slurmd/job8496110/slurm_sc
 603061    1 S user3  03:43:50 00:00:00  0.0    4152 bash config.sh
 603064    5 R user3  03:43:50 12:03:54 97.4 5907868 /home/cat/user3/MACE/mace_env/bin/python3 /ho
Total: 3 processes and 7 threads
Uptime: 16:06:30 up 12 days, 2:08, 0 users, load average: 4.28, 4.09, 4.02
====================================================
Nodes in this job with GPU Generic Resources (Gres):
sd654 gpu:A100:4
Running GPU tasks:
Node:  GPU GPU-type              | Temp  GPU% |     Mem / Tot    | user:process/PID(Mem)
sd654: [2] NVIDIA A100-SXM4-40GB | 33°C, 41 % | 24060 / 40960 MB | user3:python3/603064(24050M)
====================================================
Scratch disk usage for JobID 8496110:
Node:  Usage Scratch folder
sd654: 8.0K  /scratch/8496110
Scratch disks on JobID 8496110 compute nodes:
Node:  Size Used Avail Use% Mounted on
sd654: 1.7T 12G  1.7T  1%   /scratch
====================================================
The psjob command's prerequisites are listed in the README.md file in [2], namely the "gpustat" and "ClusterShell" tools.
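If you just want a quick one-off look without psjob, and assuming ClusterShell and gpustat are already installed on the compute nodes, something like this should also work:

$ clush -w $(squeue -j 8496110 -h -o %N) gpustat

which runs gpustat on each node in the job's nodelist (8496110 being the example job above).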
Best regards, Ole
[1] https://github.com/OleHolmNielsen/Slurm_tools/tree/master/pestat [2] https://github.com/OleHolmNielsen/Slurm_tools/tree/master/jobs