[slurm-users] How to view GPU indices of the completed jobs?

Kota Tsuyuzaki kota.tsuyuzaki.pc at hco.ntt.co.jp
Thu Jun 4 07:57:54 UTC 2020


Hello Guys,

We are running GPU clusters with Slurm and SlurmDBD (version 19.05 series), and some of the GPUs seem to cause trouble for the jobs
assigned to them. To investigate whether the trouble occurs on the same GPUs, I'd like to get the GPU indices of completed jobs.

As I understand it, `scontrol show job` can show the indices (as IDX in the gres info) but cannot be used for completed jobs.
Conversely, `sacct -j` works for completed jobs but does not print the indices.
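
To illustrate what I mean by the IDX field, here is a minimal sketch that extracts the indices from one line of detailed job output (for example from `scontrol -d show job`) while the job is still known to the controller. The sample line, its exact layout, and the helper names `expand_indices` and `gpu_indices` are assumptions for illustration only; the real output format may differ between Slurm versions.

```python
import re

# Hypothetical sample line. The exact layout of detailed scontrol output
# varies between Slurm versions, so treat this format as an assumption.
SAMPLE = "Nodes=node01 CPU_IDs=0-3 Mem=0 GRES_IDX=gpu(IDX:0-1,3)"

# Capture whatever appears between "(IDX:" and the closing parenthesis.
IDX_PATTERN = re.compile(r"\(IDX:([^)]*)\)")


def expand_indices(spec):
    """Expand a range list such as '0-1,3' into [0, 1, 3]."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part or part == "N/A":
            continue  # nothing allocated or index unknown
        if "-" in part:
            start, end = part.split("-", 1)
            indices.extend(range(int(start), int(end) + 1))
        else:
            indices.append(int(part))
    return indices


def gpu_indices(line):
    """Return the GPU indices found in one output line, or [] if none."""
    match = IDX_PATTERN.search(line)
    return expand_indices(match.group(1)) if match else []


if __name__ == "__main__":
    print(gpu_indices(SAMPLE))
```

Something like this only works for jobs that `scontrol` can still see, which is exactly the limitation described above.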

Is there any way (commands, configuration, etc.) to see the allocated GPU indices for completed jobs?

Best regards,

--------------------------------------------
Kota Tsuyuzaki
kota.tsuyuzaki.pc at hco.ntt.co.jp
NTT Software Innovation Center
Distributed Processing Platform Technology Project
0422-59-2837
---------------------------------------------






