[slurm-users] Strange error, submission denied

Marcus Wagner wagner at itc.rwth-aachen.de
Fri Feb 15 05:25:38 UTC 2019


Hi Chris,

that can't be right, or there is some bug elsewhere:

We have configured CR_ONE_TASK_PER_CORE, so two tasks will never share 
a core and its hyperthread.
According to your theory, I configured only 24 cores with 48 threads. 
But then using just --ntasks=48 would give me two nodes, right?
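
To make that arithmetic explicit, here is a minimal sketch. It assumes a 
deliberately simplified model (one task per core, configured CPUs counted 
as threads); the helper names are hypothetical and this is not Slurm's 
actual selection logic:

```python
import math


def tasks_per_node(configured_cpus, threads_per_core, one_task_per_core=True):
    # Hypothetical helper: simplified model, NOT Slurm's real algorithm.
    # With one task per core, only whole cores count as task slots.
    if one_task_per_core:
        return configured_cpus // threads_per_core
    return configured_cpus


def nodes_needed(ntasks, configured_cpus, threads_per_core, one_task_per_core=True):
    # Round up: a partially filled node is still a whole node.
    slots = tasks_per_node(configured_cpus, threads_per_core, one_task_per_core)
    return math.ceil(ntasks / slots)


# CPUs=48, ThreadsPerCore=2 read as 24 cores (the theory above)
print(nodes_needed(48, configured_cpus=48, threads_per_core=2))  # 2
# CPUs=96, ThreadsPerCore=2 as detected by slurmd (48 cores)
print(nodes_needed(48, configured_cpus=96, threads_per_core=2))  # 1
```

Under this model the CPUs=48 reading would need two nodes for 48 tasks, 
which is not what Slurm does.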

But Slurm schedules these 48 tasks onto one node:

    NumNodes=1 NumCPUs=48 NumTasks=48 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
    TRES=cpu=48,mem=182400M,node=1,billing=48

Here you can also see that CPUs/Task=1, so the tasks really are 
scheduled with one CPU each.
Essentially, --ntasks=48 --ntasks-per-node=48 should do the same. 
Obviously it does not, because in this case the submission gets denied.
Nonetheless, you can see in the cgroups and in the binding, which is 
done by the task affinity plugin, that every task gets not only a core 
but also its hyperthread.
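
One way to look at the binding from inside a task is sketched below. It 
assumes Linux (os.sched_getaffinity is Linux-only) and a launch via srun, 
which sets SLURM_PROCID; the function name is hypothetical:

```python
import os


def bound_cpus():
    # CPU IDs the current process is allowed to run on (Linux only).
    return sorted(os.sched_getaffinity(0))


if __name__ == "__main__":
    cpus = bound_cpus()
    # SLURM_PROCID is set by srun; falls back to "?" outside of Slurm.
    task = os.environ.get("SLURM_PROCID", "?")
    print(f"task {task}: {len(cpus)} CPU(s) -> {cpus}")
```

If a task is bound to a core plus its hyperthread, I would expect two 
CPU IDs per task here rather than one.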

I think I'll have to file a bug at SchedMD.


Best
Marcus

On 2/14/19 5:35 PM, Christopher Samuel wrote:
> On 2/14/19 12:22 AM, Marcus Wagner wrote:
>
>> CPUs=96 Boards=1 SocketsPerBoard=4 CoresPerSocket=12 ThreadsPerCore=2 
>> RealMemory=191905
>
> That's different to what you put in your config in the original email 
> though.  There you had:
>
> CPUs=48  Sockets=4 CoresPerSocket=12 ThreadsPerCore=2
>
> This config tells Slurm there are just 24 cores for a total of 48 
> threads. Try updating your config with what slurmd detected and see if 
> that helps.
>
> All the best,
> Chris
>

-- 
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
www.itc.rwth-aachen.de



