[slurm-users] Reservation to exceed time limit on a partition for a user
Carsten Beyer
beyer at dkrz.de
Thu Jan 3 09:44:25 MST 2019
Hi Matthew,
We use a QOS for this and add it to the Slurm user who needs to exceed
the partition time limit. You can also set a time limit in the QOS, so
that a user cannot exceed the partition limit by too much. Example from our system,
where the partition has an 8 hour run limit per job:
# grep -i qos slurm.conf
PriorityWeightQOS=1000
AccountingStorageEnforce=limits,qos
#
# sacctmgr -s show qos ch0636 format=Name,Flags%32,MaxWall
      Name                            Flags     MaxWall
---------- -------------------------------- -----------
    ch0636   DenyOnLimit,PartitionTimeLimit    12:00:00
#
# scontrol show part compute
PartitionName=compute
AllowGroups=ALL DenyAccounts=bmx825,mh1010 AllowQos=ALL
AllocNodes=ALL Default=NO QoS=N/A
DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
MaxNodes=512 MaxTime=08:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
Nodes=m[10000-11420,11440-11545,11553-11577]
PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=EXCLUSIVE
OverTimeLimit=NONE PreemptMode=OFF
State=UP TotalCPUs=74496 TotalNodes=1552 SelectTypeParameters=NONE
DefMemPerCPU=1280 MaxMemPerCPU=5360
With the PartitionTimeLimit flag a job running under this QOS may override
the MaxTime of the partition, and with DenyOnLimit a job that requests more
than the QOS MaxWall is rejected at submission. The QOS is added to the user
and then shows up like this:
# sacctmgr -s show user foo format=User,Account,MaxJobs,QOS%30
      User    Account MaxJobs                            QOS
---------- ---------- ------- ------------------------------
       foo     ch0636      20          ch0636,express,normal
       foo  noaccount       0                         normal
       foo     mh0469      20                 express,normal
#
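In case it is useful, such a QOS can be created and attached to a user with
sacctmgr roughly like this (just a sketch reusing the names from our example;
QOS name, user and limits need to be adapted to your site):

# sacctmgr add qos ch0636
# sacctmgr modify qos ch0636 set Flags=DenyOnLimit,PartitionTimeLimit MaxWall=12:00:00
# sacctmgr modify user foo set qos+=ch0636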
The user also needs to request the QOS with '#SBATCH --qos=ch0636' in the
job script.
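A minimal job script along those lines might look like the following (a sketch
only; the program name and resource requests are placeholders):

#!/bin/bash
#SBATCH --partition=compute
#SBATCH --qos=ch0636
# 10 hours is above the partition MaxTime of 08:00:00 but within the QOS MaxWall of 12:00:00
#SBATCH --time=10:00:00
#SBATCH --nodes=2

srun ./my_program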
Cheers,
Carsten
--
Carsten Beyer
Abteilung Systeme
Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45a * D-20146 Hamburg * Germany
Phone: +49 40 460094-221
Fax: +49 40 460094-270
Email: beyer at dkrz.de
URL: http://www.dkrz.de
Geschäftsführer: Prof. Dr. Thomas Ludwig
Sitz der Gesellschaft: Hamburg
Amtsgericht Hamburg HRB 39784
On 03.01.2019 16:23, Matthew BETTINGER wrote:
> Answering my own question here. I created a hidden partition, which looks like this:
>
> PartitionName=FOO
> AllowGroups=ALL AllowAccounts=rt AllowQos=ALL
> AllocNodes=ALL Default=NO QoS=N/A
> DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=YES GraceTime=0 Hidden=YES
> MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
> Nodes=nid00[192-255]
> PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=NO
> OverTimeLimit=NONE PreemptMode=OFF
> State=UP TotalCPUs=1280 TotalNodes=64 SelectTypeParameters=NONE
> DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
>
> I can run jobs in there, but if I restrict it to just one user (myself) then the job does not run. I may have to leave this partition as it is until I can figure out the correct way, since we need this to run today.
>
> On 1/3/19, 8:41 AM, "Matthew BETTINGER" <matthew.bettinger at external.total.com> wrote:
>
> Hello,
>
> We are running Slurm 17.02.6 with accounting on a Cray CLE system.
>
> We currently have a 24 hour run limit on our partitions, and a user needs to run a job which will exceed 24 hours of runtime. I tried to make a reservation, as seen below, allocating the user 36 hours to run his job, but it was killed at the 24 hour run limit. Can someone explain what is going on, and what is the proper way to allow a user to exceed the partition time limit without having to modify slurm.conf, push it out to all of the nodes, run ansible plays, reconfigure, etc.? I thought that this is what reservations were for.
>
> Here is the reservation I created that failed when it ran over 24 hours
>
> scontrol show res
> ReservationName=CoolBreeze StartTime=2018-12-27T10:08:11 EndTime=2018-12-28T22:08:11 Duration=1-12:00:00
> Nodes=nid00[192-239] NodeCnt=48 CoreCnt=480 Features=(null) PartitionName=GPU Flags=
> TRES=cpu=960
> Users=coolbreeze Accounts=(null) Licenses=(null) State=ACTIVE BurstBuffer=(null) Watts=n/a
>
> Here is the partition with the resources the user needs to run on
>
> PartitionName=GPU
> AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL
> AllocNodes=ALL Default=NO QoS=N/A
> DefaultTime=01:00:00 DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
> MaxNodes=UNLIMITED MaxTime=1-00:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
> Nodes=nid00[192-255]
> PriorityJobFactor=1 PriorityTier=1 RootOnly=NO ReqResv=NO OverSubscribe=EXCLUSIVE
> OverTimeLimit=NONE PreemptMode=OFF
> State=UP TotalCPUs=1280 TotalNodes=64 SelectTypeParameters=NONE
> DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
>
> Thanks!
>
>
>
>
>