[slurm-users] Require help in setting up Priority in slurm
Sudeep Narayan Banerjee
snbanerjee at iitgn.ac.in
Thu Apr 23 14:22:18 UTC 2020
Dear All:
I want to set up priority queueing in Slurm (slurm-18.08.7). Say a user,
userA, from group USER1-grp has 4 jobs running and 4 more jobs in PD
(pending) status. Now userB from group User2-grp wants to submit a job,
and that job should get higher priority than userA's pending jobs.
Currently the scheduler behaves as FIFO; no fairshare policy has been
implemented yet.
I have gone through this PDF
<https://slurm.schedmd.com/SLUG19/Priority_and_Fair_Trees.pdf> once and am
studying link1 <https://slurm.schedmd.com/priority_multifactor.html> and
link2 <https://slurm.schedmd.com/classic_fair_share.html>.
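From link1, my understanding is that the main change is switching from the
default FIFO ordering to the multifactor priority plugin. Below is a rough
sketch of what I plan to try, which is essentially the commented-out
priority block already in my attached slurm.conf with the comments removed
(the weight values are only placeholders, not tuned recommendations):

PriorityType=priority/multifactor
PriorityDecayHalfLife=14-0
PriorityCalcPeriod=0-0:05
PriorityFavorSmall=NO
PriorityMaxAge=7-0
PriorityWeightAge=10000
PriorityWeightFairshare=100000
PriorityWeightJobSize=1000
PriorityWeightPartition=5000
PriorityWeightQOS=1000
PriorityWeightTRES=CPU=1000,Mem=2000,GRES/gpu=30000

With PriorityWeightFairshare much larger than the other weights, my
understanding is that the fairshare factor would dominate, so userB's job
(from a group with no recent usage) should be ordered ahead of userA's
pending jobs.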
I have already set up the queues:
[root at aneesur ~]# sinfo -v
-----------------------------
dead = false
exact = 0
filtering = false
format = %9P %.5a %.10l %.6D %.6t %N
iterate = 0
long = false
no_header = false
node_field = false
node_format = false
nodes = n/a
part_field = true
partition = n/a
responding = false
states = (null)
sort = (null)
summarize = false
verbose = 1
-----------------------------
all_flag = false
alloc_mem_flag = false
avail_flag = true
cpus_flag = false
default_time_flag = false
disk_flag = false
features_flag = false
features_flag_act = false
groups_flag = false
gres_flag = false
job_size_flag = false
max_time_flag = true
memory_flag = false
partition_flag = true
port_flag = false
priority_job_factor_flag = false
priority_tier_flag = false
reason_flag = false
reason_timestamp_flag = false
reason_user_flag = false
reservation_flag = false
root_flag = false
oversubscribe_flag = false
state_flag = true
weight_flag = false
-----------------------------
Thu Apr 23 19:46:33 2020
sinfo: Consumable Resources (CR) Node Selection plugin loaded with argument 1
sinfo: Cray node selection plugin loaded
sinfo: Linear node selection plugin loaded with argument 1
sinfo: Serial Job Resource Selection plugin loaded with argument 1
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
short* up 1:00:00 9 idle node[1-9]
medium up 2-00:00:00 9 idle node[1-9]
long up 4-00:00:00 9 idle node[1-9]
intensive up 7-00:00:00 9 idle node[1-9]
gpu up infinite 4 idle gpu[1-4]
I am attaching the slurm.conf file. Any help or guidance would genuinely
help. I know the PDFs and links are the best guide, but I need to set this
up and release it a bit early!
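Since AccountingStorageType=accounting_storage/slurmdbd is already in
place, my assumption (please correct me) is that both groups also need
accounts with fairshare values in slurmdbd, roughly along these lines
(the account and user names are just taken from my example above):

sacctmgr add account USER1-grp Fairshare=100
sacctmgr add account User2-grp Fairshare=100
sacctmgr add user userA Account=USER1-grp
sacctmgr add user userB Account=User2-grp

After editing slurm.conf I plan to restart slurmctld and then check the
computed job factors with 'sprio -l' to confirm that the fairshare
component actually differs between the two users.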
--
Thanks & Regards,
Sudeep Narayan Banerjee
System Analyst | Scientist B
Information System Technology Facility
Academic Block 5 | Room 110
Indian Institute of Technology Gandhinagar
Palaj, Gujarat 382355 INDIA
-------------- next part --------------
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
SlurmctldHost=aneesur
#
#MailProg=/bin/mail
MpiDefault=none
#MpiParams=ports=#-#
ProctrackType=proctrack/cgroup
CacheGroups=0
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
#SlurmdUser=root
#StateSaveLocation=/var/spool/slurm
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/cgroup
#
#
# TIMERS
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
InactiveLimit=0
Waittime=0
#
#
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_CPU
SchedulerParameters=assoc_limit_continue
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/slurmdbd
ClusterName=cluster
AccountingStorageEnforce=limits,qos
AccountingStorageTRES=cpu,gres/gpu
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=info
#SlurmctldLogFile=
#SlurmdDebug=info
#SlurmdLogFile=
#PriorityType=priority/multifactor
#PriorityDecayHalfLife=14-0
#PriorityCalcPeriod=0-0:05
#PriorityFavorSmall=NO
#PriorityMaxAge=7-0
#PriorityWeightAge=10000
#PriorityWeightFairshare=100000
#PriorityWeightJobSize=1000
#PriorityWeightPartition=5000
#PriorityWeightQOS=1000
#PriorityWeightTRES=CPU=1000,Mem=2000,GRES/gpu=30000
##
#
# COMPUTE NODES
NodeName=node[1-9] Sockets=2 CoresPerSocket=20 ThreadsPerCore=1 Procs=40 State=IDLE
NodeName=gpu[1-4] Procs=40 Gres=gpu:1 State=IDLE
NodeName=aneesur Sockets=2 CoresPerSocket=20 ThreadsPerCore=1 Procs=40 State=IDLE
NodeName=aneesur1 Sockets=2 CoresPerSocket=20 ThreadsPerCore=1 Procs=40 State=IDLE
#PartitionName=main Nodes=node[1-9] Default=YES MaxTime=INFINITE State=UP
PartitionName=short Nodes=node[1-9] AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL Default=YES MaxTime=60 MinNodes=1 MaxNodes=1 Priority=1 State=UP
PartitionName=medium Nodes=node[1-9] AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL Default=NO MaxTime=2880 MaxNodes=2 Priority=2 State=UP
PartitionName=long Nodes=node[1-9] AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL Default=NO MaxTime=5760 MaxNodes=2 Priority=3 State=UP
PartitionName=intensive Nodes=node[1-9] AllowGroups=ALL AllowAccounts=ALL AllowQos=ALL Default=NO MaxTime=10080 MaxNodes=4 Priority=4 State=UP
PartitionName=gpu Nodes=gpu[1-4] Default=NO MaxTime=INFINITE State=UP