Not sure if this is correct, but I think you need to leave a bit of RAM for the OS to use, so it's best not to let Slurm allocate ALL of it. I usually take 8 GB off to allow for that - negligible when our nodes have at least 768 GB of RAM. At least, this has been my experience when using cgroups.
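
As a rough sketch (the node name and numbers below are made up, not our actual config): if 'slurmd -C' reports RealMemory=785000 on a 768 GB node, I would put something like this in slurm.conf:

    NodeName=node[01-04] CPUs=32 Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=776808

i.e. 8192 MB (8 GB) less than what the node reports (RealMemory is in MB). I believe MemSpecLimit can also be used to reserve memory for system use, but simply lowering RealMemory has worked for us.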

Emyr James
Head of Scientific IT
CRG - Centre for Genomic Regulation


From: Diego Zuccato via slurm-users <slurm-users@lists.schedmd.com>
Sent: 11 July 2024 08:06
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Re: Nodes TRES double what is requested
 
Hint: round down the RAM reported by 'slurmd -C' a bit, or you risk the
nodes not coming back up after an upgrade that leaves slightly less
free RAM than configured.
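
For example (illustrative values only), on a node like the ones described
below 'slurmd -C' would print something along the lines of

    NodeName=node01 CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=785000

and the slurm.conf entry would then use a slightly smaller value, e.g.
RealMemory=780000, so the node doesn't end up drained because it reports
less memory than configured after an update.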

Diego

On 10/07/2024 17:29, Brian Andrus via slurm-users wrote:
> Jack,
>
> To make sure things are set right, run 'slurmd -C' on the node and use
> that output in your config.
>
> It can also give you insight into what the node actually reports versus
> what you expect.
>
> Brian Andrus
>
> On 7/10/2024 1:25 AM, jack.mellor--- via slurm-users wrote:
>> Hi,
>>
>> We are running Slurm 23.02.6. Our nodes have hyperthreading disabled
>> and we have slurm.conf set to CPUs=32 for each node (each node has 2
>> processors with 16 cores). When we allocate a job, such as salloc -n
>> 32, it allocates a whole node, but sinfo shows double the allocation
>> in the node's TRES (cpu=64). It also shows in sinfo that the node has
>> 4294967264 idle CPUs.
>>
>> I'm not sure if it's a known bug or an issue with our config. I have
>> tried various things, like setting the sockets/boards in slurm.conf.
>>
>> Thanks
>> Jack
>>
>

--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786

--
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-leave@lists.schedmd.com