[slurm-users] Strange error, submission denied
Marcus Wagner
wagner at itc.rwth-aachen.de
Thu Feb 21 07:12:38 UTC 2019
Ahh, one thing I forgot: the following is working again ...
--ntasks=24 --ntasks-per-node=24
NumNodes=1 NumCPUs=48 NumTasks=24 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=48,mem=120000M,energy=63,node=1,billing=48
Socks/Node=* NtasksPerN:B:S:C=24:0:*:1 CoreSpec=*
MinCPUsNode=24 MinMemoryNode=120000M MinTmpDiskNode=0
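
For reference, the full submission that produced the record above would look
roughly like this (the wrapped hostname is only a placeholder command):

sbatch --ntasks=24 --ntasks-per-node=24 --wrap hostname
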
Best
Marcus
On 2/21/19 7:17 AM, Marcus Wagner wrote:
> Hi Andreas,
>
> I'll try to sum this up ;)
>
> First of all, I have now used a Broadwell node, so there is no interference
> from Skylake and sub-NUMA clustering.
>
> We are using Slurm 18.08.5-2.
>
> I have configured the node as slurmd -C tells me:
> NodeName=lnm596 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2
> RealMemory=120000
> Feature=bwx2650,hostok,hpcwork Weight=10430
> State=UNKNOWN
>
> This is what slurmctld knows about the node:
> NodeName=lnm596 Arch=x86_64 CoresPerSocket=12
> CPUAlloc=0 CPUTot=48 CPULoad=0.03
> AvailableFeatures=bwx2650,hostok,hpcwork
> ActiveFeatures=bwx2650,hostok,hpcwork
> Gres=(null)
> GresDrain=N/A
> GresUsed=gpu:0
> NodeAddr=lnm596 NodeHostName=lnm596 Version=18.08
> OS=Linux 3.10.0-957.5.1.el7.x86_64 #1 SMP Fri Feb 1 14:54:57 UTC 2019
> RealMemory=120000 AllocMem=0 FreeMem=125507 Sockets=2 Boards=1
> State=IDLE ThreadsPerCore=2 TmpDisk=0 Weight=10430 Owner=N/A
> MCS_label=N/A
> Partitions=future
> BootTime=2019-02-19T07:43:33 SlurmdStartTime=2019-02-20T12:08:54
> CfgTRES=cpu=48,mem=120000M,billing=48
> AllocTRES=
> CapWatts=n/a
> CurrentWatts=120 LowestJoules=714879 ConsumedJoules=8059263
> ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
>
>
> Let's first begin with half of the node:
>
> --ntasks=12 -> 12 CPUs requested. I implicitly get the hyperthread for
> free (apart from the accounting).
> NumNodes=1 NumCPUs=24 NumTasks=12 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
> TRES=cpu=24,mem=120000M,energy=46,node=1,billing=24
> Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
> MinCPUsNode=1 MinMemoryNode=120000M MinTmpDiskNode=0
>
> --ntasks=12 --cpus-per-task=2 -> 24 CPUs requested. I have now explicitly
> asked for 24 CPUs.
> NumNodes=1 NumCPUs=24 NumTasks=12 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
> TRES=cpu=24,mem=120000M,energy=55,node=1,billing=24
> Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
> MinCPUsNode=2 MinMemoryNode=120000M MinTmpDiskNode=0
>
> --ntasks=12 --ntasks-per-node=12 --cpus-per-task=2 -> 24 CPUs requested.
> Additional constraint: all 12 tasks should be on one node. Here, too, I
> asked for 24 CPUs.
> NumNodes=1 NumCPUs=24 NumTasks=12 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
> TRES=cpu=24,mem=120000M,energy=55,node=1,billing=24
> Socks/Node=* NtasksPerN:B:S:C=12:0:*:1 CoreSpec=*
> MinCPUsNode=24 MinMemoryNode=120000M MinTmpDiskNode=0
>
> Everything is fine up to now. Now I'll try to use the full node:
>
> --ntasks=24 -> 24 CPUs requested, implicitly got 48.
> NumNodes=1 NumCPUs=48 NumTasks=24 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
> TRES=cpu=48,mem=120000M,energy=62,node=1,billing=48
> Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
> MinCPUsNode=1 MinMemoryNode=120000M MinTmpDiskNode=0
>
> --ntasks=24 --cpus-per-task=2 -> 48 CPUs explicitly requested.
> NumNodes=1 NumCPUs=48 NumTasks=24 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
> TRES=cpu=48,mem=120000M,energy=62,node=1,billing=48
> Socks/Node=* NtasksPerN:B:S:C=0:0:*:1 CoreSpec=*
> MinCPUsNode=2 MinMemoryNode=120000M MinTmpDiskNode=0
>
> And now the funny thing, which I don't understand:
> --ntasks=24 --ntasks-per-node=24 --cpus-per-task=2 -> 48 CPUs requested,
> all 24 tasks on one node. Slurm tells me:
> sbatch: error: Batch job submission failed: Requested node
> configuration is not available
>
> I would have expected the following job, which would have fit onto the
> node:
> NumNodes=1 NumCPUs=48 NumTasks=24 CPUs/Task=2 ReqB:S:C:T=0:0:*:*
> TRES=cpu=48,mem=120000M,energy=62,node=1,billing=48
> Socks/Node=* NtasksPerN:B:S:C=24:0:*:1 CoreSpec=*
> MinCPUsNode=48 MinMemoryNode=120000M MinTmpDiskNode=0
>
> part of the sbatch -vvv output:
> sbatch: ntasks : 24 (set)
> sbatch: cpus_per_task : 2
> sbatch: nodes : 1 (set)
> sbatch: sockets-per-node : -2
> sbatch: cores-per-socket : -2
> sbatch: threads-per-core : -2
> sbatch: ntasks-per-node : 24
> sbatch: ntasks-per-socket : -2
> sbatch: ntasks-per-core : -2
>
> So, again, that is 24 tasks per node, 2 CPUs per task and 1 node,
> altogether 48 CPUs on one node, which fits perfectly, as one can see
> from the last two examples.
>
>
> I am just asking explicitly for what Slurm already gives me implicitly,
> or have I misunderstood something?
>
> We will have to look into this further internally. It might be that we
> have to give up CR_ONE_TASK_PER_CORE.
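>
> For context, CR_ONE_TASK_PER_CORE is one of the SelectTypeParameters flags
> in slurm.conf. A typical combination would look roughly like this (the
> other flags shown here are only an assumption, not necessarily our exact
> setup):
>
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core_Memory,CR_ONE_TASK_PER_CORE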
>
>
> Best
> Marcus
>
> P.S.:
> Sorry for the lengthy post
>
> On 2/20/19 11:59 AM, Henkel wrote:
>> Hi Chris,
>> Hi Marcus,
>>
>> Just want to understand the cause, too. I'll try to sum it up.
>>
>> Chris you have
>>
>> CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=20 ThreadsPerCore=2
>>
>> and
>>
>> srun -C gpu -N 1 --ntasks-per-node=80 hostname
>>
>> works.
>>
>> Marcus has configured
>>
>> CPUs=48 Sockets=4 CoresPerSocket=12 ThreadsPerCore=2
>> (slurmd -C says CPUs=96 Boards=1 SocketsPerBoard=4 CoresPerSocket=12
>> ThreadsPerCore=2)
>>
>> and
>>
>> CR_ONE_TASK_PER_CORE
>>
>> and
>>
>> srun -n 48 WORKS
>>
>> srun -N 1 --ntasks-per-node=48 DOESN'T WORK.
>>
>> I'm not sure whether it's caused by CR_ONE_TASK_PER_CORE, but at least that's
>> one of the major differences. I'm wondering if the effort to force using
>> only physical cores is duplicated by removing the 48 threads AND setting
>> CR_ONE_TASK_PER_CORE. My impression is that with CR_ONE_TASK_PER_CORE,
>> ntasks-per-node accounts for threads (you have set ThreadsPerCore=2),
>> hence only 24 may work, but CR_ONE_TASK_PER_CORE doesn't affect the
>> selection of 'cores only' with ntasks.
>>
>> We don't use CR_ONE_TASK_PER_CORE; our users either set -c 2 or
>> --hint=nomultithread, which results in core-only allocation, for example:
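>>
>> srun -N 1 --ntasks-per-node=24 -c 2 ./my_app
>> srun -N 1 --ntasks-per-node=24 --hint=nomultithread ./my_app
>>
>> (Illustrative commands only; ./my_app and the task counts are placeholders.)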
>>
>> You could also enforce this with a job_submit plugin (e.g. in Lua).
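>>
>> A minimal job_submit.lua sketch of that idea could look like the following.
>> This is purely illustrative and untested; the field names and the NO_VAL16
>> sentinel should be checked against your Slurm version.
>>
>> -- job_submit.lua: ask for both hardware threads per task so that each
>> -- task effectively gets a whole physical core.
>> function slurm_job_submit(job_desc, part_list, submit_uid)
>>     -- 65534 is NO_VAL16, i.e. the user did not set --cpus-per-task
>>     if job_desc.cpus_per_task == nil or job_desc.cpus_per_task >= 65534 then
>>         job_desc.cpus_per_task = 2
>>     end
>>     return slurm.SUCCESS
>> end
>>
>> function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
>>     return slurm.SUCCESS
>> end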
>>
>> The fact that CR_ONE_TASK_PER_CORE is described as "under change" in
>> the public bugs, and that there is a non-accessible bug about it, probably
>> suggests that it is better not to use it unless you have to.
>>
>> Best,
>>
>> Andreas
>>
>> On 2/20/19 7:49 AM, Chris Samuel wrote:
>>> On Tuesday, 19 February 2019 10:14:21 PM PST Marcus Wagner wrote:
>>>
>>>> sbatch -N 1 --ntasks-per-node=48 --wrap hostname
>>>> submission denied, got jobid 199805
>>> On one of our 40-core nodes with 2 hyperthreads per core:
>>>
>>> $ srun -C gpu -N 1 --ntasks-per-node=80 hostname | uniq -c
>>> 80 nodename02
>>>
>>> The spec is:
>>>
>>> CPUs=80 Boards=1 SocketsPerBoard=2 CoresPerSocket=20 ThreadsPerCore=2
>>>
>>> Hope this helps!
>>>
>>> All the best,
>>> Chris
>
--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
www.itc.rwth-aachen.de