[slurm-users] slurmd -C showing incorrect core count

Kirill 'kkm' Katsnelson kkm at pobox.com
Fri Mar 13 04:37:05 UTC 2020


On Wed, Mar 11, 2020 at 9:57 PM Chris Samuel <chris at csamuel.org> wrote:

> If so move it out of the way somewhere safe (just in case) and try again.
>

Aaah, that's a cool find! I haven't really looked inside my nodes for more
than a year, since I debugged all my stuff so that it "just works". They are
conjured out of nothing and dissolve back into nothing after 10 minutes of
inactivity. But good to know! In the cloud, changing the amount of RAM and
the number and even type of CPUs is all too easy.

Mike, if I were you, I'd probably move *all* files out of that
directory. Who knows what other surprising mismatches due to the changed
hardware it contains.
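
A rough sketch of that "move everything aside, just in case" step, in Python.
The directory is not named in this message, so both paths are parameters you
would fill in yourself, and move_aside is only an illustrative name, not
anything that ships with Slurm. It moves every entry into a fresh timestamped
folder so that nothing is deleted and it can all be put back later.

```python
import shutil
import time
from pathlib import Path


def move_aside(src_dir, backup_root):
    """Move every entry of src_dir into a new timestamped folder under backup_root.

    Both arguments are placeholders for illustration; backup_root must live
    outside src_dir. Returns the backup folder and the names that were moved.
    """
    src = Path(src_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"
    # Refuse to reuse an existing backup folder, so nothing gets overwritten.
    dest.mkdir(parents=True, exist_ok=False)

    moved = []
    for entry in sorted(src.iterdir()):
        shutil.move(str(entry), str(dest / entry.name))
        moved.append(entry.name)
    return dest, moved
```

After a call such as move_aside(the_directory, "/root/safe"), the original
directory is left empty but still exists, and you can try `slurmd -C` again.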

I will eat my hat if there is anything of value there--I never had to
prepopulate any directories under /var on a compute node. /etc/slurm/* is
the only Slurm-related thing my cloud rig nodes pull down from the common
config storage upon boot--the only boot in their lifetime.

That was a very educational problem-solving session, thank you both, guys!

 -kkm