[slurm-users] weight setting not working
Andy Leung Yin Sui
moliulay at gmail.com
Tue Mar 12 05:11:33 UTC 2019
Hi,
I am new to Slurm and want to use the Weight option to control where
jobs are scheduled.
I have some machines with identical hardware configurations and GPU
cards. I use a QoS to require users to request at least 1 GPU GRES
when submitting jobs.
The machines serve multiple partitions.
What I want is for jobs in the gpu_2h partition to consume the
dedicated nodes first, which I tried to achieve by adding Weight
settings (e.g. schedule to gpu38/39 rather than gpu36/37). However,
the scheduler does not follow the weight settings and places the jobs
on gpu36/37 (e.g. with srun -p gpu_2h).
All the GPU nodes are idle and their billing is the same. Did I miss
something? Is there some limitation when a node serves multiple
partitions or when a job consumes GRES? Please advise. Thank you very
much.
Below are the settings, which may help.
slurm.conf
NodeName=gpu[36-37] Gres=gpu:titanxp:4 ThreadsPerCore=2 State=unknown Sockets=2 CPUs=40 CoresPerSocket=10 Weight=20
NodeName=gpu[38-39] Gres=gpu:titanxp:4 ThreadsPerCore=2 State=unknown Sockets=2 CPUs=40 CoresPerSocket=10 Weight=1
PartitionName=gpu_2h Nodes=gpu[36-39] Default=YES MaxTime=02:00:00 DefaultTime=02:00:00 MaxNodes=1 State=UP AllowQos=GPU
PartitionName=gpu_8h Nodes=gpu[31-37] MaxTime=08:00:00 DefaultTime=08:00:00 MaxNodes=1 State=UP AllowQos=GPU
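To make "dedicated nodes" concrete, here is a minimal Python sketch that expands the two Nodes= expressions from the partition lines above and shows which gpu_2h nodes are also in gpu_8h. The helper expand_hostlist is hypothetical and only handles the simple prefix[lo-hi] form used here; it is not Slurm's own hostlist parser.

```python
import re


def expand_hostlist(expr):
    """Expand a simple 'prefix[lo-hi]' expression (hypothetical helper).

    Anything that does not match that exact shape is returned unchanged.
    """
    m = re.fullmatch(r"(\w+)\[(\d+)-(\d+)\]", expr)
    if not m:
        return [expr]
    prefix, lo, hi = m.group(1), int(m.group(2)), int(m.group(3))
    return [f"{prefix}{i}" for i in range(lo, hi + 1)]


# Node lists copied from the PartitionName lines in slurm.conf above.
partitions = {
    "gpu_2h": expand_hostlist("gpu[36-39]"),
    "gpu_8h": expand_hostlist("gpu[31-37]"),
}

# Nodes of gpu_2h that are shared with gpu_8h, and those that are not.
shared = sorted(set(partitions["gpu_2h"]) & set(partitions["gpu_8h"]))
dedicated = sorted(set(partitions["gpu_2h"]) - set(partitions["gpu_8h"]))

print("shared with gpu_8h:", shared)
print("dedicated to gpu_2h:", dedicated)
```

Under these assumptions, gpu36 and gpu37 are the shared nodes, while gpu38 and gpu39 belong only to gpu_2h, which is why they carry the lower Weight.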
# sinfo -N -O nodelist,partition,gres,weight
NODELIST  PARTITION  GRES           WEIGHT
gpu36     gpu_2h*    gpu:titanxp:4  20
gpu36     gpu_8h     gpu:titanxp:4  20
gpu37     gpu_2h*    gpu:titanxp:4  20
gpu37     gpu_8h     gpu:titanxp:4  20
gpu38     gpu_2h*    gpu:titanxp:4  1
gpu39     gpu_2h*    gpu:titanxp:4  1
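The expectation described above can be written down as a small Python sketch. It parses the sinfo output shown here and orders the gpu_2h nodes by ascending weight, following the rule in the slurm.conf documentation that, all else being equal, the lowest-weight nodes satisfying a job's requirements are allocated first. The function nodes_by_weight is a hypothetical helper, and this models only that documented rule, not the scheduler's actual node selection.

```python
# sinfo output copied from the message above.
SINFO_OUTPUT = """\
NODELIST PARTITION GRES WEIGHT
gpu36 gpu_2h* gpu:titanxp:4 20
gpu36 gpu_8h gpu:titanxp:4 20
gpu37 gpu_2h* gpu:titanxp:4 20
gpu37 gpu_8h gpu:titanxp:4 20
gpu38 gpu_2h* gpu:titanxp:4 1
gpu39 gpu_2h* gpu:titanxp:4 1
"""


def nodes_by_weight(sinfo_text, partition):
    """Return the partition's nodes ordered by ascending weight (hypothetical helper)."""
    rows = []
    for line in sinfo_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 4:
            continue
        node, part, _gres, weight = fields
        # sinfo marks the default partition with a trailing '*'.
        if part.rstrip("*") == partition:
            rows.append((int(weight), node))
    return [node for _weight, node in sorted(rows)]


expected_order = nodes_by_weight(SINFO_OUTPUT, "gpu_2h")
print(expected_order)
```

If weight alone decided placement, a single-node job in gpu_2h would land on the first entry of this list, so seeing it start on gpu36 or gpu37 instead is the behaviour in question.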