[slurm-users] Ubuntu20.04 - DRAIN state with SMT in node capabilities

Sven Duscha sven.duscha at tum.de
Wed Nov 24 17:29:50 UTC 2021


Dear all,

a small update.

On 24.11.21 18:13, Sven Duscha wrote:
> So, maybe this wouldn't be a big disadvantage, if that allows us to 
> use 32 slots on the "16 Cores with 2 SMT" Xeons in the PowerEdge R720 
> machines with Ubuntu 20.04
>
>
> Has anyone else encountered this problem? Is there a better/proper way 
> of using all SMT/HT cores?


It took about half an hour - with no jobs running, besides some test 
jobs - for the node to fall into "drained" state again:


sinfo -lNe
Wed Nov 24 18:23:05 2021
NODELIST   NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
ekgen1         1  cluster*        idle   16    2:8:1 480000        0      1   (null) none
ekgen2         1  cluster*       mixed   32    2:8:2 250000        0      1   (null) none
ekgen3         1    debian        idle   32    2:8:2 250000        0      1   (null) none
ekgen4         1  cluster*       mixed   32    2:8:2 250000        0      1   (null) none
ekgen5         1  cluster*        idle   32    2:8:2 250000        0      1   (null) none
ekgen6         1    debian        idle   32    2:8:2 250000        0      1   (null) none
ekgen7         1  cluster*        idle   32    2:8:2 250000        0      1   (null) none
ekgen8         1    debian     drained   32   2:16:1 250000        0      1   (null) Low socket*core*thre
ekgen9         1  cluster*        idle   32    2:8:2 192000        0      1   (null) none


Thus,

NodeName=ekgen[8] RealMemory=250000 Sockets=2 CoresPerSocket=16 
ThreadsPerCore=1 State=UNKNOWN

isn't a working node declaration either.
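
To make the arithmetic behind the two layouts explicit, here is a minimal Python sketch. It is not Slurm's validation code; the helper names (parse_topology, compare) are made up for illustration, and the "detected" layout 2:8:2 is an assumption taken from the other 32-CPU nodes in the sinfo output above, since no slurmd -C output for ekgen8 is shown here.

```python
import re
from math import prod

# Fields of a slurm.conf NodeName line that describe the CPU topology.
KEYS = ("Sockets", "CoresPerSocket", "ThreadsPerCore")


def parse_topology(node_line):
    """Return (Sockets, CoresPerSocket, ThreadsPerCore) from a NodeName line."""
    fields = dict(re.findall(r"(\w+)=(\S+)", node_line))
    return tuple(int(fields[key]) for key in KEYS)


def compare(configured, detected):
    """Compare two S:C:T triples: total CPU count and per-field differences."""
    return {
        "configured_cpus": prod(configured),
        "detected_cpus": prod(detected),
        "mismatched_fields": [
            key for key, c, d in zip(KEYS, configured, detected) if c != d
        ],
    }


# The declaration from this mail.
declared = (
    "NodeName=ekgen[8] RealMemory=250000 Sockets=2 CoresPerSocket=16 "
    "ThreadsPerCore=1 State=UNKNOWN"
)
configured = parse_topology(declared)

# ASSUMPTION: ekgen8 has the same hardware layout as the 2:8:2 nodes.
assumed_detected = (2, 8, 2)

report = compare(configured, assumed_detected)
print(report)
```

Under that assumption both layouts multiply out to 32 CPUs, but they differ in CoresPerSocket and ThreadsPerCore. Whether that per-field difference is what drains the node is not established by this sketch.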


The question remains why a declaration matching the output of slurmd -C 
doesn't work with Ubuntu 20.04.


P.S.: Fixed version typo in the subject.


-- 
Sven Duscha
Deutsches Herzzentrum München
Technische Universität München
Lazarettstraße 36
80636 München
+49 89 1218 2602
