[slurm-users] Inconsistent cpu bindings with cpu-bind=none

Marcus Boden mboden at gwdg.de
Mon Feb 17 08:48:35 UTC 2020


Hi everyone,

I am seeing inconsistent CPU bindings when launching with mpirun, even
though I set SLURM_CPU_BIND=none. My job script:
#SBATCH -N 20
#SBATCH --tasks-per-node=40
#SBATCH -p medium40
#SBATCH -t 30 
#SBATCH -o out/%J.out
#SBATCH -e out/%J.err
#SBATCH --reservation=root_98

module load impi/2019.4 2>&1

export I_MPI_DEBUG=6
export SLURM_CPU_BIND=none

. /sw/comm/impi/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/mpivars.sh release
BENCH=/sw/comm/impi/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin/IMB-MPI1

mpirun -np 800 $BENCH -npmin 800 -iter 50 -time 120 -msglog 16:18 -include Allreduce Bcast Barrier Exchange Gather PingPing PingPong Reduce Scatter Allgather Alltoall Reduce_scatter

My output is as follows:
[...]
[0] MPI startup(): 37      154426   gcn1311    {37,77}
[0] MPI startup(): 38      154427   gcn1311    {38,78}
[0] MPI startup(): 39      154428   gcn1311    {39,79}
[0] MPI startup(): 40      161061   gcn1312    {0}
[0] MPI startup(): 41      161062   gcn1312    {40}
[0] MPI startup(): 42      161063   gcn1312    {0}
[0] MPI startup(): 43      161064   gcn1312    {40}
[0] MPI startup(): 44      161065   gcn1312    {0}
[...]
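
To see at a glance which nodes are affected, the startup lines can be
summarised per node. This is only a minimal sketch: the function name
summarize_pinning is hypothetical, and it assumes the I_MPI_DEBUG lines
have exactly the layout shown above (rank, PID, node, pin set, with no
spaces inside the braces).

```sh
# Summarise Intel MPI startup lines (read from stdin) per node:
# number of ranks and number of distinct pin sets.
# Fewer distinct pin sets than ranks means ranks share CPUs.
# Assumed field layout: [0] MPI startup(): <rank> <pid> <node> {<cpus>}
summarize_pinning() {
    awk '
        $2 == "MPI" && $3 == "startup():" && $4 ~ /^[0-9]+$/ && $7 ~ /^[{].*[}]$/ {
            ranks[$6]++
            # Count each (node, pin set) pair only once.
            if (!(($6, $7) in seen)) { seen[$6, $7] = 1; distinct[$6]++ }
        }
        END {
            for (n in ranks)
                printf "%s ranks=%d distinct_pin_sets=%d\n", n, ranks[n], distinct[n]
        }
    ' | sort
}
```

It would be fed the job's output file, e.g.
summarize_pinning < out/<jobid>.out (path assumed from the -o option
above). For the excerpt shown, gcn1311 has three ranks with three
distinct pin sets, while gcn1312 has five ranks but only two.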

On 8 of the 20 nodes the pinning is wrong: each rank is bound to a
single CPU and several ranks share the same one, as on gcn1312 above.
In the slurmd logs, the nodes with correct pinning show that the manual
binding was passed through:
  lllp_distribution jobid [2065227] manual binding: none
On the nodes where it did not work, slurmd fell back to the default:
  lllp_distribution jobid [2065227] default auto binding: cores, dist 1

So, for some reason, Slurm applied its default CPU binding to the tasks
on some nodes, while on the others binding was (correctly) disabled.
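
The per-node check of the slurmd log can be scripted roughly as
follows. Again a sketch under assumptions: binding_for_job is a
hypothetical helper, it only recognises the two message forms quoted
above, and the location of the slurmd log is site-specific (whatever
SlurmdLogFile points to).

```sh
# Print the binding decision slurmd logged for one job.
# $1: job ID; the slurmd log text is read from stdin.
# Only the two lllp_distribution messages quoted above are recognised;
# anything else for that job is reported as "other".
binding_for_job() {
    jobid=$1
    awk -v id="[$jobid]" '
        index($0, "lllp_distribution") && index($0, id) {
            if (index($0, "manual binding: none")) print "manual-none"
            else if (index($0, "default auto binding")) print "auto"
            else print "other"
        }
    '
}
```

Run on each node of the allocation, e.g.
binding_for_job 2065227 < /path/to/slurmd.log (placeholder path), this
prints manual-none on the nodes that honoured the setting and auto on
the ones that did not.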

Any ideas what could cause this?

Best,
Marcus
-- 
Marcus Vincent Boden, M.Sc.
Arbeitsgruppe eScience