[slurm-users] MinCPUsNode in job arrays

Andreas Hilboll hilboll+slurm at uni-bremen.de
Fri Mar 15 06:28:48 UTC 2019


Dear SLURM experts,

I'm having trouble understanding an issue we have with Slurm 17.11.10.

In one partition, "all", we have some nodes with hyperthreading and
some without, giving 56 and 28 "cores", respectively.

In the same partition, some nodes have 256GB and some have 128GB of
RAM.  All hyperthreading nodes have 256GB, and some non-hyperthreading
nodes also have 256GB; all 128GB nodes have no hyperthreading.

Now, when I submit a job array with --ntasks=1 --mem=200G, all of the
array's jobs have MinCPUsNode set to 46, which is roughly 200/256 * 56.
This effectively limits the array to the hyperthreading part of the
partition, which is obviously not what I want.  I don't want
MinCPUsNode to be set at all; after all, I'm specifying --ntasks=1.
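
To make the arithmetic concrete, here is a minimal Python sketch of the
proportional scaling I suspect is happening.  The function names
(scaled_cpus, can_host) are made up for illustration, and the scaling
rule itself is my assumption, not something taken from the Slurm
source.  With the nominal figures it gives 44 rather than 46, so the
real calculation presumably uses the configured memory values rather
than the nominal 256GB; either way the result exceeds 28.

```python
import math


def scaled_cpus(mem_request_gb, node_mem_gb, node_cpus):
    """CPU count proportional to the requested share of node memory.

    Assumed rule for illustration only: requesting a fraction f of a
    node's memory reserves the same fraction f of its CPUs, rounded up.
    """
    return math.ceil(mem_request_gb / node_mem_gb * node_cpus)


def can_host(node_cpus, min_cpus_node):
    """A node can only run the job if it has at least MinCPUsNode CPUs."""
    return node_cpus >= min_cpus_node


# Nominal figures from this mail: --mem=200G on a 256GB, 56-CPU node.
estimate = scaled_cpus(200, 256, 56)   # 200/256 * 56 = 43.75, rounded up
print("estimated CPUs:", estimate)

# MinCPUsNode=46 as reported by scontrol for the array tasks.
reported = 46
print("56-CPU node eligible:", can_host(56, reported))
print("28-CPU node eligible:", can_host(28, reported))
```

The last line is the core of the problem: any MinCPUsNode above 28
excludes the non-hyperthreading nodes, including the 256GB ones that
would otherwise fit the job.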

Is this a bug?  Or am I doing something utterly wrong here?

Cheers,
 Andreas




JobId=270402 ArrayJobId=270402 ArrayTaskId=18 JobName=calc_vcd_ts.py
  UserId=hilboll(1059) GroupId=hilboll(1059) MCS_label=N/A
  Priority=2909 Nice=0 Account=root QOS=normal
  JobState=PENDING Reason=Resources Dependency=(null)
  Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
  RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
  SubmitTime=2019-03-14T10:49:11 EligibleTime=2019-03-14T10:49:12
  StartTime=2019-03-15T10:49:12 EndTime=2019-03-16T10:49:12 Deadline=N/A
  PreemptTime=None SuspendTime=None SecsPreSuspend=0
  LastSchedEval=2019-03-14T11:04:18
  Partition=all AllocNode:Sid=login1:20705
  ReqNodeList=(null) ExcNodeList=(null)
  NodeList=(null) SchedNodeList=node07
  NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
  TRES=cpu=1,mem=200G,node=1
  Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
  MinCPUsNode=46 MinMemoryNode=200G MinTmpDiskNode=0
  Features=(null) DelayBoot=00:00:00
  Gres=(null) Reservation=(null)
  OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
  Command=/home/hilboll/prj/2018_chochocanada/calc_vcd_ts.py
  WorkDir=/home/hilboll/prj/2018_chochocanada
  StdErr=/home/hilboll/prj/2018_chochocanada/slurm-270402_4294967294.out
  StdIn=/dev/null
  StdOut=/home/hilboll/prj/2018_chochocanada/slurm-270402_4294967294.out
  Power=


