[slurm-users] Response to Rémi Palancher about Configuring slurm.conf and using subpartitions
Kratz, Zach
ZKratz at clarku.edu
Wed Oct 4 15:39:48 UTC 2023
Thank you for your response,
Just to clarify,
We do specify the node weight in the node definition lines; I was just wondering whether there is a way to be more granular in our weight assignments.
Here is our configuration right now:
---------------------------------------------------
# COMPUTE NODES - Old nodes have a lower weight, so they are preferred over the new ones
NodeName=DEFAULT Sockets=2 ThreadsPerCore=2 RealMemory=257000 CoresPerSocket=16
NodeName=head RealMemory=128000 CoresPerSocket=16
NodeName=node[1-2] RealMemory=257000 CoresPerSocket=10 Weight=10
NodeName=node[3-16] RealMemory=128000 CoresPerSocket=10 Weight=10
NodeName=node17 RealMemory=512000 CoresPerSocket=16 Weight=50
NodeName=node[18-24] RealMemory=257000 CoresPerSocket=16 Weight=50
NodeName=gpu[1-2] RealMemory=128000 CoresPerSocket=10 Gres=gpu:K80:2 Weight=10
NodeName=gpu3 RealMemory=512000 CoresPerSocket=16 Gres=gpu:A30:3 Weight=50
NodeName=gpu4 RealMemory=257000 CoresPerSocket=16 Gres=gpu:A30:2 Weight=50
# QUEUE DEFINITIONS
### CPU Queues ###
# Month-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=month-long-cpu-old Priority=5000 Default=NO MaxTime=31-0:00:00 State=UP Nodes=node[1-16] MaxNodes=4
# Week-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=week-long-cpu-old Priority=10000 Default=NO MaxTime=7-0:00:00 State=UP Nodes=node[1-16] MaxNodes=8
# Day-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=day-long-cpu-old Priority=20000 Default=NO MaxTime=1-0:00:00 State=UP Nodes=node[1-16]
# 30-Minute short, high-priority jobs may be sent here
PartitionName=short-cpu-old Priority=40000 Default=NO MaxTime=1:00:00 State=UP Nodes=node[1-16]
# Month-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=month-long-cpu Priority=5000 Default=NO MaxTime=31-0:00:00 State=UP Nodes=node[17-24] MaxNodes=4
# Week-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=week-long-cpu Priority=10000 Default=NO MaxTime=7-0:00:00 State=UP Nodes=node[17-24] MaxNodes=8
# Day-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=day-long-cpu Priority=20000 Default=NO MaxTime=1-0:00:00 State=UP Nodes=node[17-24]
# 30-Minute short, high-priority jobs may be sent here
PartitionName=short-cpu Priority=40000 Default=YES MaxTime=1:00:00 State=UP Nodes=node[17-24]
### GPU Queues ###
# Month-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=month-long-gpu Priority=5000 Default=NO MaxTime=31-0:00:00 State=UP Nodes=gpu[1-4]
# Week-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=week-long-gpu Priority=10000 Default=NO MaxTime=7-0:00:00 State=UP Nodes=gpu[1-4]
# Day-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=day-long-gpu Priority=20000 Default=NO MaxTime=1-0:00:00 State=UP Nodes=gpu[1-4]
# 30-Minute short, high-priority jobs may be sent here
PartitionName=short-gpu Priority=40000 Default=NO MaxTime=30:00 State=UP Nodes=gpu[1-4]
# Interactive sessions are considered higher priority than batch jobs.
# Specify up to 1/4 of a node in node[1-24] per interactive session:
# Older nodes have a lower weight so they will be preferred for OOD jobs over new nodes
PartitionName=interactive-cpu Priority=50000 Shared=YES:2 DefaultTime=8:00:00 MaxTime=48:00:00 State=UP Nodes=node[1-24] MaxNodes=1 MaxCPUsPerNode=16 MaxMemPerNode=64000
PartitionName=interactive-gpu Priority=50000 Shared=YES:2 DefaultTime=8:00:00 MaxTime=48:00:00 State=UP Nodes=gpu[1-4] MaxNodes=1 MaxCPUsPerNode=16 MaxMemPerNode=64000
----------------------------------------------------
Notice that the weights are set in the compute node definitions, while the interactive partition selects from Nodes=node[1-24] to decide which node will run an interactive job.
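For example, here is a rough sketch of what I mean by more detailed weight assignments: tiering the Weight values by memory size and generation, so that the smallest old nodes fill up first and the large-memory node is kept free the longest. The values below are illustrative, not our current settings:
---------------------------------------------------
# Hypothetical finer-grained weights: lower Weight = allocated first
NodeName=node[3-16]  RealMemory=128000 CoresPerSocket=10 Weight=10   # small old nodes, use first
NodeName=node[1-2]   RealMemory=257000 CoresPerSocket=10 Weight=20   # larger old nodes
NodeName=node[18-24] RealMemory=257000 CoresPerSocket=16 Weight=50   # new nodes
NodeName=node17      RealMemory=512000 CoresPerSocket=16 Weight=90   # big-memory node, use last
---------------------------------------------------
As far as I understand, the resulting ordering can be checked with something like "sinfo -N -o '%N %w'", which lists each node with its scheduling weight, or with "scontrol show node <name>", which also reports the Weight.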