[slurm-users] draining nodes due to failed killing of task?

Adrian Sevcenco Adrian.Sevcenco at spacescience.ro
Fri Aug 6 07:02:45 UTC 2021


Having just implemented some triggers, I noticed this:

NODELIST    NODES PARTITION       STATE CPUS    S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
alien-0-47      1    alien*    draining   48   48:1:1 193324   214030      1 rack-0,4 Kill task failed
alien-0-56      1    alien*     drained   48   48:1:1 193324   214030      1 rack-0,4 Kill task failed

I was wondering why a node is drained when killing a task fails, and how I can disable this behavior. (I use cgroups.)
Moreover, how can killing a task fail in the first place? (This is on Slurm 19.05.)

Thank you!
Adrian



