[slurm-users] issue with --cpus-per-task=1
Benjamin Glaessle
benjamin.glaessle at uni-tuebingen.de
Thu Mar 10 11:23:27 UTC 2022
Hello all!
We are using Slurm 20.11.8 with
> SelectType = select/cons_tres
> SelectTypeParameters = CR_CORE_MEMORY
and nodes with hyperthreading enabled, e.g.
> NodeName=slurm-node?? NodeAddr=192.?? Procs=72 Sockets=2 CoresPerSocket=18 ThreadsPerCore=2 RealMemory=...
When launching jobs on these nodes with --cpus-per-task 1, the command
executes twice:
> $ srun --cpus-per-task 1 echo foo
> foo
> foo
Digging deeper, I found
> $ srun --cpus-per-task 1 env | grep -i tasks
> SLURM_NTASKS=2
> SLURM_TASKS_PER_NODE=2
> SLURM_STEP_NUM_TASKS=2
> SLURM_STEP_TASKS_PER_NODE=2
> SLURM_NTASKS=2
> SLURM_TASKS_PER_NODE=2
> SLURM_STEP_NUM_TASKS=2
> SLURM_STEP_TASKS_PER_NODE=2
whereas `scontrol show job 12345 | grep -i -e numtasks -e numcpus` for
both the "env" and the "echo" jobs gives
> NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
A test node without ThreadsPerCore=2 behaves "normally".
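One possible reading of these numbers, sketched below as plain shell arithmetic: with CR_CORE_MEMORY the smallest allocatable unit is a whole core, which on these nodes means 2 CPUs (threads), matching NumCPUs=2 above. If srun, when no -n is given, derives the step's task count as allocated CPUs divided by --cpus-per-task, that would yield 2 tasks. This derivation rule is an assumption on my part, not something I have confirmed in the documentation or the source, and `derived_tasks` is just a hypothetical helper name for illustration.

```shell
# Hypothetical helper: task count srun might derive when -n is omitted.
# ASSUMPTION: tasks = allocated CPUs / cpus-per-task, where one whole
# core (i.e. ThreadsPerCore CPUs) is allocated under CR_CORE_MEMORY.
derived_tasks() {
    threads_per_core=$1
    cpus_per_task=$2
    alloc_cpus=$(( 1 * threads_per_core ))   # one core's worth of CPUs
    echo $(( alloc_cpus / cpus_per_task ))
}

derived_tasks 2 1   # hyperthreaded node, --cpus-per-task 1
derived_tasks 1 1   # test node without ThreadsPerCore=2
```

Under that assumption the hyperthreaded node ends up with two tasks and the test node with one, which would match what I observe.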
Also, passing -n1 explicitly
> $ srun -n1 --cpus-per-task 1 echo foo
> foo
resolves the problem.
This seems like a bug to me.
Can this be reproduced (on newer versions)?
Can this somehow be avoided by setting a default number of tasks or some
other (partition) parameter? Sorry for asking but I couldn't find
anything in the documentation.
Let me know if you need more information.
Best Regards, Benjamin