[slurm-users] issue with --cpus-per-task=1
Benjamin Glaessle
benjamin.glaessle at uni-tuebingen.de
Mon Mar 14 11:29:40 UTC 2022
Hello Diego!
Thanks for the answer.
I knew that the actual behavior is sort of mentioned in the
`--cpus-per-task` documentation.
What puzzled me - and I think is a bug - is that `scontrol show job ...`
shows NumTasks=1, whereas slurmd obviously starts 2 tasks.
We have now "resolved our issue" by exporting
> SLURM_NTASKS_PER_NODE=1
for everyone.
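For reference, a minimal sketch of how we roll this out; the profile.d
path is just our choice, any site-wide shell init would do:
> # /etc/profile.d/slurm-defaults.sh -- site-specific location
> # srun reads SLURM_NTASKS_PER_NODE as if --ntasks-per-node were
> # given on the command line, so this defaults jobs to one task
> # per node unless the user overrides it explicitly.
> export SLURM_NTASKS_PER_NODE=1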
Best Regards, Benjamin
On 10.03.22 13:00, slurm-users-request at lists.schedmd.com wrote:
> Hi Benjamin.
>
> IIUC, what you're seeing is due to multithreading. If you ask for one
> CPU, you get allocated a whole core, which means it can handle two
> tasks. Hence the "payload" is run twice to fill the two tasks. If you
> also specify that you need only one task, it gets run only once (but
> should still be able to multithread and use the second thread on the
> same core).
>
> Diego
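That matches what we see here. For anyone who wants to verify that the
allocation really is a whole core, i.e. both hardware threads, something
like the following should work; the exact CPU binding depends on the
site's TaskPlugin configuration:
> $ srun -n1 --cpus-per-task 1 grep Cpus_allowed_list /proc/self/status
On a ThreadsPerCore=2 node this should list both sibling threads of the
allocated core.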
>
> On 10/03/2022 12:23, Benjamin Glaessle wrote:
>> Hello all!
>>
>> we are using slurm 20.11.8 with
>>> SelectType            = select/cons_tres
>>> SelectTypeParameters  = CR_CORE_MEMORY
>> and nodes with hyperthreading enabled, e.g.
>>> NodeName=slurm-node NodeAddr=192. Procs=72 Sockets=2
>>> CoresPerSocket=18 ThreadsPerCore=2 RealMemory=...
>>
>> when we launch jobs on these nodes with --cpus-per-task 1, they
>> execute twice:
>>> $ srun --cpus-per-task 1 echo foo
>>> foo
>>> foo
>>
>> digging deeper I found
>>> $ srun --cpus-per-task 1 env | grep -i tasks
>>> SLURM_NTASKS=2
>>> SLURM_TASKS_PER_NODE=2
>>> SLURM_STEP_NUM_TASKS=2
>>> SLURM_STEP_TASKS_PER_NODE=2
>>> SLURM_NTASKS=2
>>> SLURM_TASKS_PER_NODE=2
>>> SLURM_STEP_NUM_TASKS=2
>>> SLURM_STEP_TASKS_PER_NODE=2
>>
>> whereas `scontrol show job 12345 | grep -i -e numtasks -e numcpus` for
>> both the "env" and the "echo" job gives
>>> NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
>>
>> A test node without ThreadsPerCore=2 behaves "normally".
>> Also
>>> $ srun -n1 --cpus-per-task 1 echo foo
>>> foo
>> resolves the problem.
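Confirming from our side: with the explicit -n1, the task count can be
checked the same way as above; SLURM_NTASKS should then come back as 1:
> $ srun -n1 --cpus-per-task 1 env | grep -i ntasks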
>>
>> This seems like a bug to me.
>> Can this be reproduced (on newer versions)?
>>
>> Can this somehow be avoided by setting a default number of tasks or some
>> other (partition) parameter? Sorry for asking, but I couldn't find
>> anything in the documentation.
>>
>> Let me know if you need more information.
>>
>> Best Regards, Benjamin
>>
>