[slurm-users] Block jobs on GPU partition when GPU is not specified

Renfro, Michael Renfro at tntech.edu
Sat Sep 25 16:08:15 UTC 2021


If you haven't already seen it, there's an example Lua script from SchedMD at [1], and I've got a copy of our local script at [2]. Otherwise, in the order you asked:


  1.  That seems reasonable, but our script just checks if there's a gres at all. I don't *think* any gres other than gres=gpu would let the job run, since our GPU nodes only have Gres=gpu:2 entries. Same thing for asking for more GPUs than are in the node: if someone asked for gres=gpu:3 or higher, the job would get blocked.

The above might be an annoyance to your users if their job just sits in the queue with no other notice, but it hasn't really been an issue here. The big benefit from your side would be that you could simplify the if statement down to something like 'if (job_desc.gres ~= nil)'.

  2.  Yes, uncomment JobSubmitPlugins=lua.

  3.  As far as I know, if you uncomment the JobSubmitPlugins line and have a job_submit.lua file in the same folder as your slurm.conf, the Lua script should get executed automatically.

  4.  Our RPM installations of Slurm contained the job_submit_lua.so, both for Bright 8 and for OpenHPC.
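
To make points 1 to 3 concrete, here is a minimal sketch of a job_submit.lua that only checks whether a gres was requested at all. It is not our production script (that one is at [2]). The partition name "gpu" and the wording of the messages are assumptions for illustration, and depending on your Slurm version the request may show up in a field other than job_desc.gres (I believe newer releases use job_desc.tres_per_node), so check what your version provides.

```lua
-- job_submit.lua: minimal sketch, not a drop-in script.
-- Assumptions (not from the thread): the GPU partition is named "gpu"
-- and the GRES request is visible in job_desc.gres.

GPU_PARTITION = "gpu"  -- hypothetical partition name, adjust to your site

-- True when the job names the GPU partition but carries no GRES request.
-- Note: job_desc.partition is nil when the user relies on the default
-- partition, and may be a comma-separated list; neither case is handled here.
function missing_gpu_request(job_desc)
   if job_desc.partition ~= GPU_PARTITION then
      return false
   end
   return job_desc.gres == nil or job_desc.gres == ""
end

function slurm_job_submit(job_desc, part_list, submit_uid)
   if missing_gpu_request(job_desc) then
      -- Goes to the slurmctld log.
      slurm.log_info("uid %s submitted to partition %s without a GRES request",
                     tostring(submit_uid), GPU_PARTITION)
      -- Goes back to the user running sbatch/srun.
      slurm.log_user("Jobs in the %s partition must request a GPU, e.g. --gres=gpu:1",
                     GPU_PARTITION)
      return slurm.ERROR
   end
   return slurm.SUCCESS
end

-- The plugin also expects a modify hook; this one accepts every change.
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
   return slurm.SUCCESS
end
```

Saved as job_submit.lua next to slurm.conf, with JobSubmitPlugins=lua uncommented, this should reject a job at submit time instead of leaving it pending. The slurm table is provided by slurmctld, so the script cannot call its functions outside the controller without stubbing that table.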

[1] https://github.com/SchedMD/slurm/blob/master/contribs/lua/job_submit.lua
[2] https://gist.github.com/mikerenfro/df89fac5052a45cc2c1651b9a30978e0

From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Ratnasamy, Fritz <fritz.ratnasamy at chicagobooth.edu>
Date: Saturday, September 25, 2021 at 12:23 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [slurm-users] Block jobs on GPU partition when GPU is not specified

Hi,

I would like to block jobs submitted in our GPU partition when gres=gpu:1 (or any number between 1 and 4) is not specified when submitting a job through sbatch or requesting an interactive session with srun.
Currently, /etc/slurm/slurm.conf has JobSubmitPlugins=lua commented out.
The liblua.so is now installed.
I would like to use something similar to the example mentioned at the end of the page:
https://slurm.schedmd.com/resource_limits.html
Can I use the following code:


function slurm_job_submit(job_desc, part_list, submit_uid)
   if (job_desc.gres ~= nil)
   then
      for g in job_desc.gres:gmatch("[^,]+")
      do
         bad = string.match(g,'^gpu[:]*[0-9]*$')
         if (bad ~= nil)
         then
            slurm.log_info("User specified gpu GRES without type: %s", bad)
            slurm.user_msg("You must always specify a type when requesting gpu GRES")
            return slurm.ERROR
         end
      end
   end
end
I do not need to check whether the model is specified, though. In that case:
1/ Should I change the line bad = string.match(g,'^gpu[:]*[0-9]*$') to string.match(g,'^gpu[:]*[0-9]')?
2/ Do I need to uncomment JobSubmitPlugins=lua?
3/ Where do I specify the call to slurm_job_submit, so that I can be sure the gres=gpu:1 check actually happens?
4/ I would need job_submit_lua.so. Where can I find that library, and if it is not there, how can I download it?
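
For question 1/, a quick way to see what each pattern accepts is to run both against a few sample GRES strings in a plain Lua interpreter, with no Slurm involved. The sketch below does that. The sample strings, including the type name v100, are made up for illustration, and check is a hypothetical helper.

```lua
-- Compare the posted pattern with the one proposed in question 1/.
ORIGINAL = "^gpu[:]*[0-9]*$"  -- from the posted script
PROPOSED = "^gpu[:]*[0-9]"    -- from question 1/

-- Returns the matched text, or nil when the pattern does not match.
function check(pattern, gres)
   return string.match(gres, pattern)
end

-- Hypothetical sample requests: no count, a count, and a type plus a count.
for _, g in ipairs({ "gpu", "gpu:1", "gpu:v100:2" }) do
   print(g, check(ORIGINAL, g), check(PROPOSED, g))
end
```

The proposed pattern matches "gpu:1" but not a bare "gpu", so a match would mean that a count was given. The test in the posted script would therefore have to be inverted, since that script rejects the job when the pattern matches. Neither pattern matches a typed request such as "gpu:v100:2", and neither is evaluated at all when job_desc.gres is nil, which is the case of a job submitted with no gres.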

Thanks for your help. I am new to regular expressions, lua and Slurm so I apologize if my questions do not make sense.


Fritz Ratnasamy
Data Scientist
Information Technology
The University of Chicago
Booth School of Business
5807 S. Woodlawn
Chicago, Illinois 60637
Phone: +(1) 773-834-4556