[slurm-users] nodes that finished calculation do not become idle

Grigory Ptashko grigory.ptashko at gmail.com
Sun Jun 27 06:59:37 UTC 2021


Hello!

Recently I've started using MPI on our HPC cluster.
It has 40 nodes and runs SLURM.
I'm new to both MPI and SLURM, but so far everything works fine, except for one thing.
In short: nodes that have finished their calculations do not become idle.
They all become idle only after every node has finished.

Here's an example of a typical node:

$ scontrol show nodes cn-022
NodeName=cn-022 Arch=x86_64 CoresPerSocket=18
CPUAlloc=36 CPUTot=36 CPULoad=1.01
AvailableFeatures=(null)
ActiveFeatures=(null)
Gres=(null)
NodeAddr=cn-022 NodeHostName=cn-022 Version=18.08
OS=Linux 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018
RealMemory=1 AllocMem=0 FreeMem=507942 Sockets=2 Boards=1
State=ALLOCATED ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=normal,long,shared
BootTime=2021-06-07T20:45:06 SlurmdStartTime=2021-06-07T20:43:27
CfgTRES=cpu=36,mem=1M,billing=36
AllocTRES=cpu=36,mem=1M,billing=36
CapWatts=n/a
CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
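
For a quicker look at just the node state, something like this also works:

$ sinfo -N -n cn-022 -o "%N %T"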


Here's my sbatch script:

#!/bin/bash
#SBATCH --job-name=robotune
#SBATCH --nodes=36                 # one allocation spanning 36 nodes
#SBATCH --ntasks=36                # one MPI task per node
#SBATCH --cpus-per-task=36         # all 36 cores of each node for its task
#SBATCH --time=5-12:00:00          # wall-time limit: 5 days 12 hours
#SBATCH --output="%x-%N-%j.out"    # job name, node name, job id

# Load the toolchain the binary was built with.
module purge
module load gnu8/8.3.0
module load mpich/3.3

# Launch one rank per node through PMI2.
srun --mpi=pmi2 /home/ptashko/work/robomarket/cmd/tune/robotune <ARGS>
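
I submit the script and watch the job roughly like this (robotune.sbatch is just my name for the file above):

$ sbatch robotune.sbatch
$ squeue -u $USER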


And here's the CPU load on all of the nodes allocated to this job:

$ scontrol show nodes cn-[005-040] | egrep "CPULoad"
CPUAlloc=36 CPUTot=36 CPULoad=26.53
CPUAlloc=36 CPUTot=36 CPULoad=18.67
CPUAlloc=36 CPUTot=36 CPULoad=4.63
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.00
CPUAlloc=36 CPUTot=36 CPULoad=1.02
CPUAlloc=36 CPUTot=36 CPULoad=0.98
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=0.99
CPUAlloc=36 CPUTot=36 CPULoad=1.02
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=0.99
CPUAlloc=36 CPUTot=36 CPULoad=0.99
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=0.99
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=0.99
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01
CPUAlloc=36 CPUTot=36 CPULoad=1.01


And:

$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
normal* up 11-00:00:0 37 alloc cn-[001,005-040]
normal* up 11-00:00:0 3 idle cn-[002-004]
long up 31-00:00:0 37 alloc cn-[001,005-040]
long up 31-00:00:0 3 idle cn-[002-004]
shared up infinite 26 alloc cn-[015-040]


So, as you can see, almost all of the nodes have finished their calculations (CPULoad around 1.0 on these 36-core nodes).
Only three are still working, but the nodes that have finished do not become idle!
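
For example, checking which job still holds one of the finished nodes:

$ squeue -w cn-022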

I want the finished nodes to become idle. What am I possibly doing wrong?

Thank you,
Grigory.


