[slurm-users] Configuring partition limit MaxCPUsPerNode

Michael Gutteridge michael.gutteridge at gmail.com
Mon Nov 26 08:56:31 MST 2018


I'm either misunderstanding how to configure the "MaxCPUsPerNode" limit or
how it behaves.  My desired end-state is that if a user submits a job to a
partition requesting more CPUs than are available on any node in that
partition, the job is rejected immediately at submit time, rather than left
pending with reason "Resources" as happens now.
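
Concretely, using the "campus" partition from the config below, what I'd
like to see is roughly this (a sketch of the intended behaviour, not actual
output):

  $ sbatch -p campus -c 12 --wrap="sleep 60"
  # wanted: sbatch refuses the job at submit time and exits non-zero,
  # instead of queueing it with reason "Resources"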

On the on-site cluster this works as expected.  I'm only running into the
problem in our cloud clusters, where I've got power management set up: when
I tear out all of the cloud/power-management configuration, the limit works
as expected again.

We're using the backfill scheduler and do have QOS's configured (though
none of the QOS's in play have a similar limit applied).  The relevant bits
from the cloud-enabled configuration are:

SelectType=select/cons_res
SelectTypeParameters=CR_Core

PartitionName=campus Default=yes DefaultTime=3-0 MaxTime=7-0 Nodes=nodef[0-69] PreemptMode=off Priority=10000 MaxCPUsPerNode=4 MaxMemPerNode=32000 State=UP
PartitionName=largenode Default=no DefaultTime=1-0 MaxTime=7-0 Nodes=nodeg[0-9] PreemptMode=off Priority=10000 State=UP
PartitionName=gpu Default=no DefaultTime=1-0 MaxTime=7-0 Nodes=nodek[0-9] PreemptMode=off Priority=10000 State=UP
NodeName=nodef[0-69] CPUs=4 RealMemory=32768 Weight=40 State=CLOUD
NodeName=nodeg[0-9] CPUs=8 RealMemory=262144 Weight=40 State=CLOUD
NodeName=nodek[0-9] Gres=gpu:V100-SXM2-16GB:1 CPUs=4 RealMemory=131072 Weight=40 State=CLOUD
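
As a side note, for anyone reproducing this, the limits as the controller
sees them can be double-checked with standard commands along these lines:

  $ scontrol show partition campus | grep -i MaxCPUs
  $ sinfo -p campus -o "%N %c %m %T"    # nodes, CPUs, memory, state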

Submitting a job that exceeds the limit (e.g. `sbatch -c 12 ...`) results
in a job that just sits pending:

JobId=27072660 JobName=wrap
   UserId=me(12345) GroupId=g_me(12345) MCS_label=N/A
   Priority=100012209 Nice=0 Account=hpc QOS=normal
   JobState=PENDING Reason=Resources Dependency=(null)
   Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=3-00:00:00 TimeMin=N/A
   SubmitTime=2018-11-26T07:50:27 EligibleTime=2018-11-26T07:50:27
   AccrueTime=2018-11-26T07:50:27
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   LastSchedEval=2018-11-26T07:53:14
   Partition=campus AllocNode:Sid=cluster-login:13257
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=1 NumCPUs=12 NumTasks=1 CPUs/Task=12 ReqB:S:C:T=0:0:*:*
   TRES=cpu=12,node=1,billing=12
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=12 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
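
The same thing is visible in squeue, e.g. with a format string of your
choosing:

  $ squeue -j 27072660 -o "%i %P %T %r"
  # reports 27072660 / campus / PENDING / Resources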

Controller log notes:

slurmctld: _build_node_list: No nodes satisfy JobId=27072660 requirements in partition campus
slurmctld: _slurm_rpc_submit_batch_job: JobId=27072660 InitPrio=100012209 usec=1037

So either there's a bug in Slurm, or I'm misunderstanding how this limit is
supposed to work.

Thanks for looking at this; any suggestions are greatly appreciated.

Michael