[slurm-users] MinCPUsNode in job arrays
Andreas Hilboll
hilboll+slurm at uni-bremen.de
Fri Mar 15 06:28:48 UTC 2019
Dear SLURM experts,
I'm having trouble understanding an issue we have with Slurm 17.11.10.
In one partition "all", we have some nodes with hyperthreading and
some without, leading to 56 and 28 "cores", respectively.
In the same partition, we have some nodes with 256GB and some with
128GB RAM. All hyperthreading nodes have 256GB, and some
non-hyperthreading nodes also have 256GB; all 128GB nodes have no
hyperthreading.
Now, when I submit a job array with --ntasks=1 --mem=200G, all of the
array's jobs have MinCPUsNode set to 46, which is roughly
200/256 * 56. This effectively limits the array to the part of the
partition with hyperthreading, which is obviously not what I want.
I don't want MinCPUsNode to be set at all; after all, I'm
specifying --ntasks=1.
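
The effect described above can be checked with a minimal Python sketch. It only models the three node classes mentioned in this post and filters them by the two per-node minimums shown in the job record. The class names (ht_256g, noht_256g, noht_128g) and the helper eligible() are made up for illustration; this is not how Slurm itself selects nodes, and it does not explain where the value 46 comes from.

```python
# Node classes in partition "all", as described in the post.
# The class names are hypothetical labels for this sketch only.
NODE_CLASSES = {
    "ht_256g": {"cpus": 56, "mem_gb": 256},    # hyperthreading, 256GB
    "noht_256g": {"cpus": 28, "mem_gb": 256},  # no hyperthreading, 256GB
    "noht_128g": {"cpus": 28, "mem_gb": 128},  # no hyperthreading, 128GB
}


def eligible(min_cpus, min_mem_gb):
    """Return the node classes that meet both per-node minimums."""
    return sorted(
        name
        for name, spec in NODE_CLASSES.items()
        if spec["cpus"] >= min_cpus and spec["mem_gb"] >= min_mem_gb
    )


# The rough estimate from the post: the memory share of a 256GB node,
# scaled to its 56 logical CPUs.
scaled = 200 / 256 * 56
print(scaled)  # 43.75, close to but not exactly the reported 46

# With MinCPUsNode=46 and MinMemoryNode=200G, only the hyperthreading
# class qualifies.
print(eligible(46, 200))  # ['ht_256g']

# With only the memory minimum (one CPU for --ntasks=1), both 256GB
# classes qualify.
print(eligible(1, 200))  # ['ht_256g', 'noht_256g']
```

Under these assumptions, the CPU minimum alone excludes the 28-CPU nodes with 256GB, which matches the observed restriction to the hyperthreading nodes.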
Is this a bug? Or am I doing something utterly wrong here?
Cheers,
Andreas
JobId=270402 ArrayJobId=270402 ArrayTaskId=18
JobName=calc_vcd_ts.py
UserId=hilboll(1059) GroupId=hilboll(1059) MCS_label=N/A
Priority=2909 Nice=0 Account=root QOS=normal
JobState=PENDING Reason=Resources Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
SubmitTime=2019-03-14T10:49:11 EligibleTime=2019-03-14T10:49:12
StartTime=2019-03-15T10:49:12 EndTime=2019-03-16T10:49:12
Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
LastSchedEval=2019-03-14T11:04:18
Partition=all AllocNode:Sid=login1:20705
ReqNodeList=(null) ExcNodeList=(null)
NodeList=(null) SchedNodeList=node07
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=200G,node=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=46 MinMemoryNode=200G MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
Gres=(null) Reservation=(null)
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=/home/hilboll/prj/2018_chochocanada/calc_vcd_ts.py
WorkDir=/home/hilboll/prj/2018_chochocanada
StdErr=/home/hilboll/prj/2018_chochocanada/slurm-270402_4294967294.out
StdIn=/dev/null
StdOut=/home/hilboll/prj/2018_chochocanada/slurm-270402_4294967294.out
Power=