There seems to be an issue with the TmpDisk value reporting in Slurm 24.05.3. While the correct value is displayed using the scontrol show nodes command, sinfo appears to report an incorrect value under certain conditions.
For example, the TmpDisk parameter for my compute nodes is configured as 21800000 in slurm.conf (the value is in MB, so 21.8 TB), and this value is shown correctly with:
scontrol show nodes ai-n007
...... State=MIXED ThreadsPerCore=1 TmpDisk=21800000 Weight=1 Owner=N/A MCS_label=N/A ....
However, when using the sinfo command with the -O Disk:20 option, the displayed value is 2180000, which is one digit short and therefore an order of magnitude lower:
sinfo -hO Disk:20 -n ai-n007
2180000
It appears that sinfo might be truncating or misinterpreting values with 8 or more digits.
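To check other nodes for the same discrepancy, something like the following could be used. It is only a minimal sketch: it assumes the output of the two commands has already been captured as text, and the helper names parse_tmpdisk and parse_sinfo_disk are hypothetical, not part of Slurm. The sample strings are the ones shown above for ai-n007.

```python
import re


def parse_tmpdisk(scontrol_output: str) -> int:
    """Extract the TmpDisk value (in MB) from `scontrol show nodes` text."""
    match = re.search(r"\bTmpDisk=(\d+)", scontrol_output)
    if match is None:
        raise ValueError("no TmpDisk field found in scontrol output")
    return int(match.group(1))


def parse_sinfo_disk(sinfo_output: str) -> int:
    """Parse the single value printed by `sinfo -hO Disk:20 -n <node>`."""
    # The field is padded to the requested width, so strip whitespace first.
    return int(sinfo_output.strip())


# Sample text copied from the output shown above (assumed captured as strings).
scontrol_text = (
    "State=MIXED ThreadsPerCore=1 TmpDisk=21800000 Weight=1 "
    "Owner=N/A MCS_label=N/A"
)
sinfo_text = "2180000"

expected = parse_tmpdisk(scontrol_text)
reported = parse_sinfo_disk(sinfo_text)

if expected != reported:
    print(f"mismatch: scontrol={expected} MB, sinfo={reported} MB")
```

With the values from ai-n007, the two numbers differ by exactly a factor of 10.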
Has anyone encountered a similar issue? Is this a known bug in Slurm 24.05.3?
Any suggestions or insights would be greatly appreciated.
Best regards, Gizo
Hi Gizo,
This issue has already been fixed, and the fix will be available in 24.11.1. Here is the ticket it was fixed in: https://support.schedmd.com/show_bug.cgi?id=21688
Regards, Megan