[slurm-users] Response to Rémi Palancher about Configuring slurm.conf and using subpartitions

Kratz, Zach ZKratz at clarku.edu
Wed Oct 4 15:39:48 UTC 2023


Thank you for your response,

Just to clarify: we do specify the node weight in the node definition lines. I was just wondering whether there is a way to be more detailed in our weight assignments.

Here is our configuration right now:

---------------------------------------------------
# COMPUTE NODES - Old nodes have lower weight, so they are preferred before the new ones

NodeName=DEFAULT Sockets=2 ThreadsPerCore=2 RealMemory=257000 CoresPerSocket=16
NodeName=head RealMemory=128000 CoresPerSocket=16
NodeName=node[1-2] RealMemory=257000 CoresPerSocket=10 Weight=10
NodeName=node[3-16] RealMemory=128000 CoresPerSocket=10 Weight=10
NodeName=node17 RealMemory=512000 CoresPerSocket=16 Weight=50
NodeName=node[18-24] RealMemory=257000 CoresPerSocket=16 Weight=50
NodeName=gpu[1-2] RealMemory=128000 CoresPerSocket=10 Gres=gpu:K80:2 Weight=10
NodeName=gpu3 RealMemory=512000 CoresPerSocket=16 Gres=gpu:A30:3 Weight=50
NodeName=gpu4 RealMemory=257000 CoresPerSocket=16 Gres=gpu:A30:2 Weight=50


# QUEUE DEFINITIONS

### CPU Queues ###

# Month-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=month-long-cpu-old Priority=5000 Default=NO MaxTime=31-0:00:00 State=UP Nodes=node[1-16] MaxNodes=4

# Week-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=week-long-cpu-old Priority=10000 Default=NO MaxTime=7-0:00:00 State=UP Nodes=node[1-16] MaxNodes=8

# Day-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=day-long-cpu-old Priority=20000 Default=NO MaxTime=1-0:00:00 State=UP Nodes=node[1-16]

# 30-Minute short, high-priority jobs may be sent here
PartitionName=short-cpu-old Priority=40000 Default=NO MaxTime=1:00:00 State=UP Nodes=node[1-16]

# Month-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=month-long-cpu Priority=5000 Default=NO MaxTime=31-0:00:00 State=UP Nodes=node[17-24] MaxNodes=4

# Week-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=week-long-cpu Priority=10000 Default=NO MaxTime=7-0:00:00 State=UP Nodes=node[17-24] MaxNodes=8

# Day-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=day-long-cpu Priority=20000 Default=NO MaxTime=1-0:00:00 State=UP Nodes=node[17-24]

# 30-Minute short, high-priority jobs may be sent here
PartitionName=short-cpu Priority=40000 Default=YES MaxTime=1:00:00 State=UP Nodes=node[17-24]

### GPU Queues ###

# Month-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=month-long-gpu Priority=5000 Default=NO MaxTime=31-0:00:00 State=UP Nodes=gpu[1-4]

# Week-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=week-long-gpu Priority=10000 Default=NO MaxTime=7-0:00:00 State=UP Nodes=gpu[1-4]

# Day-long jobs must be sent here (shorter jobs can be assigned higher priorities below)
PartitionName=day-long-gpu Priority=20000 Default=NO MaxTime=1-0:00:00 State=UP Nodes=gpu[1-4]

# 30-Minute short, high-priority jobs may be sent here
PartitionName=short-gpu Priority=40000 Default=NO MaxTime=30:00 State=UP Nodes=gpu[1-4]


# Interactive sessions are considered higher priority than batch jobs.

# Specify up to 1/4 of a node in node[1-24] per interactive session:
# Older nodes have a lower weight so they will be preferred for OOD jobs over new nodes
PartitionName=interactive-cpu Priority=50000 Shared=YES:2 DefaultTime=8:00:00 MaxTime=48:00:00 State=UP Nodes=node[1-24] MaxNodes=1 MaxCPUsPerNode=16 MaxMemPerNode=64000

PartitionName=interactive-gpu Priority=50000 Shared=YES:2 DefaultTime=8:00:00 MaxTime=48:00:00 State=UP Nodes=gpu[1-4] MaxNodes=1 MaxCPUsPerNode=16 MaxMemPerNode=64000

----------------------------------------------------

Notice that the weights are set in the compute node definitions, while the interactive partitions are where Slurm selects from Nodes=node[1-24] (or gpu[1-4]) to decide which node will run the interactive job.
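
As an illustration of what a more detailed assignment could look like within the node lines, here is a sketch only. The weight values are hypothetical and are not part of our current configuration; it assumes the goal is to fill the small-memory old nodes first and keep the large-memory nodes free for as long as possible. Since Weight is a per-node setting, the same ordering would apply in every partition that contains these nodes.

```conf
# Hypothetical finer-grained weights (illustrative values only).
# All else being equal, nodes with the lowest weight are allocated first.
NodeName=node[3-16] RealMemory=128000 CoresPerSocket=10 Weight=10
NodeName=node[1-2] RealMemory=257000 CoresPerSocket=10 Weight=20
NodeName=node[18-24] RealMemory=257000 CoresPerSocket=16 Weight=50
NodeName=node17 RealMemory=512000 CoresPerSocket=16 Weight=60
```

With values like these, an interactive-cpu job that fits on any node would land on node[3-16] before node[1-2], and on node17 only when nothing else is available.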
