[slurm-users] GPU process accounting information

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Fri Jan 15 12:41:31 UTC 2021


We have installed some new GPU nodes, and now users are asking for some 
sort of monitoring of GPU utilisation and GPU memory utilisation at the 
end of a job, like what Slurm already provides for CPU and memory usage.

I haven't found any pages describing how to perform GPU accounting within 
Slurm, so I would like to ask the user community for some advice on the 
best practices and any available (simple) tools out there.

What I have discovered is that Nvidia provides per-process GPU accounting 
via nvidia-smi[1].  It is enabled (as root) with

$ nvidia-smi --accounting-mode=1

and queried with

$ nvidia-smi --query-accounted-apps=pid,gpu_utilization,mem_utilization,max_memory_usage,time --format=csv

but the documentation seems quite scant, and so far I don't see any output 
from this query command (presumably because only processes started after 
accounting mode was enabled get recorded).
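For what it's worth, the query output (once accounting is enabled and a GPU 
process has run) is CSV with units embedded in the header, so it is easy to 
post-process.  Below is a minimal parsing sketch; the sample line is 
illustrative, not captured from a real device, and the field names are taken 
from the nvidia-smi --help-query-accounted-apps list.

```python
import csv
import io

# Illustrative output of:
#   nvidia-smi --query-accounted-apps=pid,gpu_utilization,mem_utilization,max_memory_usage,time --format=csv
# (values are made up for the example)
SAMPLE = """pid, gpu_utilization [%], mem_utilization [%], max_memory_usage [MiB], time [ms]
12345, 87 %, 45 %, 4021 MiB, 3600000
"""

def parse_accounted_apps(text):
    """Parse nvidia-smi accounting CSV into a list of dicts with numeric values."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    records = []
    for row in reader:
        rec = {}
        for key, value in row.items():
            number = value.split()[0]           # strip trailing units like "%" or "MiB"
            name = key.split(" [")[0]           # strip "[unit]" from the header
            rec[name] = int(number) if number.isdigit() else value
        records.append(rec)
    return records

if __name__ == "__main__":
    for rec in parse_accounted_apps(SAMPLE):
        print(rec)
```

Something like this could run in a job epilogue to summarise GPU usage per PID.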

Some questions:

1. Is there a way to integrate the Nvidia process accounting into Slurm?

2. Can users run the above command in the job scripts and get the GPU 
accounting information?
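On question 2, my understanding is that the query itself needs no special 
privileges (only enabling accounting mode does), so a job script could call 
it at the end of the run.  A small wrapper sketch, assuming the standard 
nvidia-smi flags and that accounting mode was already enabled by root before 
the job's GPU processes started:

```python
import subprocess

# Fields assumed from the nvidia-smi accounting query documentation
FIELDS = "pid,gpu_utilization,mem_utilization,max_memory_usage,time"

def accounting_query_cmd(fields=FIELDS):
    """Build the nvidia-smi invocation for the per-process accounting query."""
    return ["nvidia-smi", "--query-accounted-apps=" + fields, "--format=csv"]

def query_accounted_apps(fields=FIELDS):
    """Run the query and return the raw CSV text (raises if nvidia-smi is absent)."""
    result = subprocess.run(accounting_query_cmd(fields),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

A user could call query_accounted_apps() at the end of a batch script and 
write the CSV to the job's output file.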


1. https://developer.nvidia.com/nvidia-system-management-interface
