[slurm-users] Slurm 20.02.3 error: CPUs=1 match no Sockets, Sockets*CoresPerSocket or Sockets*CoresPerSocket*ThreadsPerCore. Resetting CPUs.

mercan ahmet.mercan at uhem.itu.edu.tr
Tue Jun 16 09:27:40 UTC 2020


Hi;

Did you check the /var/log/messages file for errors? Systemd logs to 
this file instead of the slurmctld log file.

Ahmet M.


On 16.06.2020 at 11:12, Ole Holm Nielsen wrote:
> Today we upgraded the controller node from 19.05 to 20.02.3, and 
> immediately all Slurm commands (on the controller node) give error 
> messages for all partitions:
>
> # sinfo --version
> sinfo: error: NodeNames=a[001-140] CPUs=1 match no Sockets, 
> Sockets*CoresPerSocket or Sockets*CoresPerSocket*ThreadsPerCore. 
> Resetting CPUs.
> (lines deleted)
> slurm 20.02.3
>
> In slurm.conf we have defined NodeName like:
>
> NodeName=a[001-140] Weight=10001 Boards=1 SocketsPerBoard=2 
> CoresPerSocket=4 ThreadsPerCore=1 ...
>
> According to the slurm.conf manual the CPUs should then be calculated 
> automatically:
>
> "If CPUs is omitted, its default will be set equal to the product of 
> Boards, Sockets, CoresPerSocket, and ThreadsPerCore."
>
> Has anyone else seen this error with Slurm 20.02?
>
> I wonder if there is a problem with specifying SocketsPerBoard 
> instead of Sockets?  The slurm.conf manual doesn't seem to prefer one 
> over the other.
>
> I've opened a bug https://bugs.schedmd.com/show_bug.cgi?id=9241
>
> Thanks,
> Ole
>
>
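For reference, the rule quoted above from the slurm.conf manual can be 
checked by hand. The following is only a rough sketch, not Slurm's own 
logic: expected_cpus is a hypothetical helper, and it assumes that 
SocketsPerBoard stands in for the Sockets term of the product when 
Sockets is not given. The elided rest of the NodeName line is left out.

```python
def expected_cpus(node_line: str) -> int:
    """Hypothetical helper: default CPUs as the product quoted from the manual."""
    # Collect Key=Value tokens from the NodeName line (keys compared case-insensitively).
    params = {}
    for token in node_line.split():
        if "=" in token:
            key, value = token.split("=", 1)
            params[key.lower()] = value

    boards = int(params.get("boards", 1))
    cores = int(params.get("corespersocket", 1))
    threads = int(params.get("threadspercore", 1))
    # Assumption: SocketsPerBoard is used as the socket term when Sockets is absent.
    sockets = int(params.get("sockets", params.get("socketsperboard", 1)))

    # "the product of Boards, Sockets, CoresPerSocket, and ThreadsPerCore"
    return boards * sockets * cores * threads


line = ("NodeName=a[001-140] Weight=10001 Boards=1 SocketsPerBoard=2 "
        "CoresPerSocket=4 ThreadsPerCore=1")
print(expected_cpus(line))  # prints 8
```

Under these assumptions the nodes above should come out at 8 CPUs, 
not the CPUs=1 reported in the error message.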


