[slurm-users] Inconsistent cpu bindings with cpu-bind=none
Boden, Marcus Vincent
mboden at gwdg.de
Thu Feb 20 08:45:28 UTC 2020
Hey John,
thanks for the workaround.
After some more testing, I've noticed that this does not occur when using Intel MPI 2018.4, only with the 2019 versions. The slurmd logs show that slurm does not set any binding in that case.
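(In case someone wants to reproduce the check: the binding decision shows up in the slurmd log on each compute node, so something along the lines of

grep lllp_distribution /var/log/slurmd.log

should find it. The actual log path depends on the SlurmdLogFile setting in slurm.conf, so adjust as needed.)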
Best,
Marcus
________________________________
From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Donners, John <john.donners at atos.net>
Sent: Tuesday, February 18, 2020 10:41:29 PM
To: slurm-users at lists.schedmd.com
Subject: Re: [slurm-users] Inconsistent cpu bindings with cpu-bind=none
Hi all,
I have a few more remarks about this question (I have been in contact with Marcus about this):
- The idea of the jobscript is that SLURM does not do any binding and leaves binding up to mpirun.
- This works fine on the first node, where SLURM does not bind the processes (so mpirun can do this).
- On the second node SLURM uses (faulty) core binding: all processes are bound round-robin to the hyperthreads of the first core. Intel's mpirun respects the cpuset and as a result the processes are bound incorrectly.
This looks like a SLURM issue to me. SLURM version 19.05.5 is used.
A workaround is to use I_MPI_PIN_RESPECT_CPUSET=no.
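For example, with the jobscript below it should be enough to add a single line right before the mpirun call (just a sketch, untested):

export I_MPI_PIN_RESPECT_CPUSET=no   # let Intel MPI pin the ranks itself instead of inheriting the faulty cpuset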
Cheers,
John
Hi everyone,
I am facing a bit of a weird issue with CPU bindings and mpirun:
My jobscript:
#!/bin/bash
#SBATCH -N 20
#SBATCH --tasks-per-node=40
#SBATCH -p medium40
#SBATCH -t 30
#SBATCH -o out/%J.out
#SBATCH -e out/%J.err
#SBATCH --reservation=root_98
module load impi/2019.4 2>&1
export I_MPI_DEBUG=6
export SLURM_CPU_BIND=none
. /sw/comm/impi/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpivars.sh release
BENCH=/sw/comm/impi/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/IMB-MPI1
mpirun -np 800 $BENCH -npmin 800 -iter 50 -time 120 -msglog 16:18 -include Allreduce Bcast Barrier Exchange Gather PingPing PingPong Reduce Scatter Allgather Alltoall Reduce_scatter
My output is as follows:
[...]
[0] MPI startup(): 37 154426 gcn1311 {37,77}
[0] MPI startup(): 38 154427 gcn1311 {38,78}
[0] MPI startup(): 39 154428 gcn1311 {39,79}
[0] MPI startup(): 40 161061 gcn1312 {0}
[0] MPI startup(): 41 161062 gcn1312 {40}
[0] MPI startup(): 42 161063 gcn1312 {0}
[0] MPI startup(): 43 161064 gcn1312 {40}
[0] MPI startup(): 44 161065 gcn1312 {0}
[...]
On 8 out of 20 nodes I got the wrong pinning (e.g. on gcn1312 all ranks end up on CPUs 0 and 40, the two hyperthreads of the first core). In the slurmd logs I found that on the nodes where the pinning was correct, the manual binding was communicated correctly:
lllp_distribution jobid [2065227] manual binding: none
On the nodes where it did not work, the default auto binding was used instead:
lllp_distribution jobid [2065227] default auto binding: cores, dist 1
So, for some reason, Slurm told some tasks to use CPU binding, while for others the CPU binding was (correctly) disabled.
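As a quick cross-check, independent of the I_MPI_DEBUG output, one can print the cpuset each launched process actually gets (just a sketch; it only relies on /proc/self/status):

mpirun -np 800 bash -c 'echo "$(hostname) $(grep Cpus_allowed_list /proc/self/status)"' | sort | uniq -c

On the affected nodes this should show the narrow masks instead of the full range.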
Any ideas what could cause this?
Best,
Marcus