Hi Guillaume,
as Rob already mentioned, this could be a way for you (the partition
was only created temporarily online for testing). You could also add
your MaxTRES=node=1 for further restriction. We do something similar
with a QOS to restrict the number of CPUs per user in certain
partitions.
sacctmgr create qos name=maxtrespu200G maxtrespu=mem=200G flags=denyonlimit
scontrol create partition=testtres qos=maxtrespu200g maxtime=08:00:00 nodes=lt[10000-10003] DefMemPerCPU=940 MaxMemPerCPU=940 OverSubscribe=NO
That results in:
4 jobs with 100G each:
---
[root@levantetest ~]# squeue
JOBID PARTITION     NAME    USER ST  TIME NODES NODELIST(REASON)
  862  testtres hostname xxxxxxx PD  0:00     1 (QOSMaxMemoryPerUser)
  861  testtres hostname xxxxxxx PD  0:00     1 (QOSMaxMemoryPerUser)
  860  testtres hostname xxxxxxx  R  0:15     1 lt10000
  859  testtres hostname xxxxxxx  R  0:22     1 lt10000
6 jobs with 50G each:
---
[k202068@levantetest ~]$ squeue
JOBID PARTITION     NAME    USER ST  TIME NODES NODELIST(REASON)
  876  testtres hostname xxxxxxx PD  0:00     1 (QOSMaxMemoryPerUser)
  875  testtres hostname xxxxxxx PD  0:00     1 (QOSMaxMemoryPerUser)
  874  testtres hostname xxxxxxx  R  9:09     1 lt10000
  873  testtres hostname xxxxxxx  R  9:15     1 lt10000
  872  testtres hostname xxxxxxx  R  9:22     1 lt10000
  871  testtres hostname xxxxxxx  R  9:26     1 lt10000
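The running/pending split in both listings follows from simple arithmetic on the 200G per-user cap. A minimal sketch, assuming equal-sized jobs and no other limit getting in the way (running_under_cap is a hypothetical helper, not a Slurm command):

```shell
#!/usr/bin/env bash
# Hypothetical helper: how many equal-sized jobs fit under a per-user memory cap.
# Arguments: cap in GB, memory per job in GB, number of submitted jobs.
running_under_cap() {
  local cap_gb=$1 job_gb=$2 submitted=$3
  local fit=$(( cap_gb / job_gb ))            # whole jobs that fit under the cap
  echo $(( submitted < fit ? submitted : fit ))
}

r100=$(running_under_cap 200 100 4)
r50=$(running_under_cap 200 50 6)
echo "4 x 100G: running=$r100 pending=$(( 4 - r100 ))"
echo "6 x 50G: running=$r50 pending=$(( 6 - r50 ))"
```

This gives 2 running and 2 pending for the 100G jobs, and 4 running and 2 pending for the 50G jobs, which matches the squeue output above.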
Best regards,
Carsten
--
Carsten Beyer
Systems Department
Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45a * D-20146 Hamburg * Germany
Phone: +49 40 460094-221
Fax: +49 40 460094-270
Email: beyer@dkrz.de
URL: http://www.dkrz.de
Managing Director: Prof. Dr. Thomas Ludwig
Registered office: Hamburg
Commercial register: Amtsgericht Hamburg HRB 39784
> "So if they submit a 2nd job, that job can start but will have to go onto another node, and will again be restricted to 200G? So they can start as many jobs as there are nodes, and each job will be restricted to using 1 node and 200G of memory?"
Yes, that's it. We already have MaxNodes=1, so a job can't be spread over multiple nodes.
To be more precise, the limit should be per user and not per job. To illustrate, let's imagine we have 3 empty nodes and a limit of 200G per user per node. If a user submits 10 jobs, each requesting 100G of memory, there should be 2 jobs running on each node and 4 jobs pending.
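The expected outcome of that example can be checked with a little arithmetic. This is only a sketch of the desired behaviour, not of anything Slurm does today, and it assumes equal-sized jobs and nodes that are otherwise empty with enough memory (all variable names are illustrative):

```shell
#!/usr/bin/env bash
# Desired per-user, per-node limit from the example: 3 nodes, 200G per user per node.
nodes=3
cap_gb=200        # wanted limit per user on each node
job_gb=100        # memory requested by each job
submitted=10      # jobs submitted by one user

per_node=$(( cap_gb / job_gb ))        # jobs of this user allowed on one node
capacity=$(( nodes * per_node ))       # jobs of this user allowed cluster-wide
running=$(( submitted < capacity ? submitted : capacity ))
pending=$(( submitted - running ))

echo "per node: $per_node, running: $running, pending: $pending"
```

With these numbers the script prints 2 jobs per node, 6 running and 4 pending.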
Guillaume
From: "Groner, Rob" <rug262@psu.edu>
To: "Guillaume COCHARD" <guillaume.cochard@cc.in2p3.fr>
Cc: slurm-users@lists.schedmd.com
Sent: Tuesday, September 24, 2024 16:37:34
Subject: Re: Max TRES per user and node
Ah, sorry, I didn't catch that from your first post (though you did say it).
So, you are trying to limit the user to no more than 200G of memory on a single node? So if they submit a 2nd job, that job can start but will have to go onto another node, and will again be restricted to 200G? So they can start as many jobs as there are nodes, and each job will be restricted to using 1 node and 200G of memory? Or can they submit a job asking for 4 nodes, where they are limited to 200G on each node? Or are they limited to a single node, no matter how many jobs?
Rob
From: Guillaume COCHARD <guillaume.cochard@cc.in2p3.fr>
Sent: Tuesday, September 24, 2024 10:09 AM
To: Groner, Rob <rug262@psu.edu>
Cc: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: Re: Max TRES per user and node
Thank you for your answer.
To test it I tried:
sacctmgr update qos normal set maxtresperuser=cpu=2
# Then in slurm.conf
PartitionName=test […] qos=normal
But then if I submit several 1-cpu jobs only two start and the others stay pending, even though I have several nodes available. So it seems that MaxTRESPerUser is a QoS-wide limit, and doesn't limit TRES per user and per node but rather per user and QoS (or rather partition since I applied the QoS on the partition). Did I miss something?
Thanks again,
Guillaume
From: "Groner, Rob" <rug262@psu.edu>
To: slurm-users@lists.schedmd.com, "Guillaume COCHARD" <guillaume.cochard@cc.in2p3.fr>
Sent: Tuesday, September 24, 2024 15:45:08
Subject: Re: Max TRES per user and node
You have the right idea.
On that same page, you'll find MaxTRESPerUser, as a QOS parameter.
You can create a QOS with the restrictions you'd like, and then in the partition definition, you give it that QOS. The QOS will then apply its restrictions to any jobs that use that partition.
Rob
From: Guillaume COCHARD via slurm-users <slurm-users@lists.schedmd.com>
Sent: Tuesday, September 24, 2024 9:30 AM
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Max TRES per user and node
Hello,
We are looking for a method to limit the TRES used by each user on a per-node basis. For example, we would like to limit the total memory allocation of jobs from a user to 200G per node.
There is MaxTRESPerNode (https://slurm.schedmd.com/sacctmgr.html#OPT_MaxTRESPerNode), but unfortunately, this is a per-job limit, not per user.
Ideally, we would like to apply this limit on partitions and/or QoS. Does anyone know if this is possible and how to achieve it?
Thank you,