Hi,
Thank you for your helpful reply! Based on your input, I have made the following changes to the system configuration:
Created a new QoS:
Priority: 200
Restriction: 3 nodes, 24 GPUs
Here are the commands I used:
sacctmgr add qos test
sacctmgr modify qos test set priority=200
sacctmgr modify qos test set GrpTRES=cpu=24
sacctmgr modify qos test set GrpTRES=gres/gpu=24,node=3
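To double-check the resulting limits, the QoS can be listed with something like this (the format fields are just an example):
sacctmgr show qos test format=Name,Priority,GrpTRES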
Attached the QoS to users from different groups as their default QoS.
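For reference, attaching a QoS as a user's default can be done along these lines (<username> is a placeholder; as far as I know, the QoS also has to be in the user's QOS list before it can be set as the default):
sacctmgr modify user where name=<username> set QOS+=test
sacctmgr modify user where name=<username> set DefaultQOS=test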
Created a floating partition with all the nodes from the default partition and attached the same QoS to this partition. The configuration is as follows:
PartitionName=testingp MaxTime=7-0:00:00 DefaultTime=01:00:00 AllowQos=test State=UP Nodes=node1,node2,node3,node4,node5,node5 DefCpuPerGPU=16 MaxCPUsPerNode=192
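The resulting partition definition and node states can be checked with, for example:
sinfo -p testingp
scontrol show partition testingp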
However, when these users submit jobs to the testingp partition, they do not receive the expected priority: their jobs remain pending in the queue without being allocated resources, while users without any priority boost are able to get resources on the default partition.
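For example, to see why the jobs stay pending and what priority they actually receive, something like the following can be used (the squeue format string is just an example: %Q prints the job priority and %r the pending reason; <username> is a placeholder):
squeue -p testingp -o "%.12i %.10u %.8T %.10Q %r"
sprio -l -u <username>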
Could you please confirm if my setup is correct, or if any modifications are required on my end?
My Slurm version is 21.08.6.
--
Regards,
Manisha Yadav
> On 01/28/2025 12:44 PM Bjørn-Helge Mevik via slurm-users
> <slurm-users@lists.schedmd.com> wrote:
>
>
> Manisha Yadav via slurm-users
> <slurm-users@lists.schedmd.com> writes:
>
> > To achieve this, I attempted to use QoS by creating a floating
> > partition with some of the nodes and configuring a QoS with
> > priority. I also set a limit with GrpTRES=gres/gpu=24, given that each
> > node has 8 GPUs, and there are 3 nodes in total.
>
> If there are more nodes with GPUs, this will not prevent these users
> from getting GPUs on more than 3 nodes, it will only prevent them from
> getting more than 24 GPUs. It will not prevent them from running
> cpu-only jobs on other nodes either. I think using
> GrpTRES=gres/gpu=24,node=3 (or perhaps simply GrpTRES=node=3) should
> work.
>
> --
> Regards,
> Bjørn-Helge Mevik, dr. scient,
> Department for Research Computing, University of Oslo