[slurm-users] slurmd -C showing incorrect core count
chris at csamuel.org
Fri Mar 13 05:25:55 UTC 2020
On 3/12/20 9:37 PM, Kirill 'kkm' Katsnelson wrote:
> Aaah, that's a cool find! I never really looked inside my nodes for more
> than a year since I debugged all my stuff so it "just works". They are
> conjured out of nothing and dissolve back into nothing after 10 minutes
> of inactivity. But good to know! In the cloud, changing the amount of
> RAM and the number and even type of CPUs is all too easy.
Also, on some architectures that discovery can take time, so having it
cached can be useful (slurmd will just read it once on startup).
For us the cached file lives on a ramdisk filesystem (Cray XC nodes have
no local disk), so it vanishes every time the node reboots.
My bet is that Mike's nodes have persistent storage and have an old copy
of this file, hence the weird discrepancy he's seeing.
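A rough way to test that theory is to check whether the cached file is older than the node's last boot. The sketch below assumes a Linux node (it reads /proc/uptime); `cache_path` is a placeholder for wherever the cached file lives on your system, and the function names are made up for illustration. A file that predates the last boot is only a hint that it might be stale, not proof that its contents are wrong.

```python
import os
import time


def read_boot_time(uptime_path="/proc/uptime"):
    """Return the approximate boot time as a Unix timestamp (Linux only)."""
    with open(uptime_path) as f:
        # First field of /proc/uptime is seconds since boot.
        uptime_seconds = float(f.read().split()[0])
    return time.time() - uptime_seconds


def predates_boot(file_mtime, boot_time):
    """True if a file was last modified before the node last booted."""
    return file_mtime < boot_time


def cache_is_stale(cache_path, boot_time=None):
    """Check whether the file at cache_path (hypothetical location) survived a reboot."""
    if boot_time is None:
        boot_time = read_boot_time()
    return predates_boot(os.path.getmtime(cache_path), boot_time)
```

If `cache_is_stale(...)` returns True on a node whose hardware has changed, removing the file and restarting slurmd should make it redo the discovery, going by the caching behaviour described above.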
All the best,
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA