[slurm-users] How to look for free nodes of a certain constraint efficiently

Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] matthew.thompson at nasa.gov
Thu Oct 14 12:44:11 UTC 2021


I work on a cluster that uses SLURM and has various types of nodes, which are selected via --constraint flags in sbatch.

Now, I started thinking "How can I figure out how many jobs are running/pending/etc on a certain type of node?". I first thought obviously "squeue --constraint=foo", but...nope. No --constraint flag with squeue. Okay. Constraints are just Features by another name, but...you can't seem to just squeue a feature either.

I asked a SLURM guru here and they suggested using --nodelist/-w a la:

  squeue -a -w nodea[001-100],nodeb[001-100],... -t r

where you pass in all the nodes of a certain type. And, yep, that works! But that also means I have to know what nodes are what type. I could obviously do a one-time parsing of "scontrol show nodes" and see what each chunk is and be done with it...but dangit I'm lazy and SLURM has so many programs and options there might just be something and I haven't read the right manpage! :)
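The one-time parsing idea above could be automated along these lines. This is a minimal sketch, not a vetted recipe: it assumes that `sinfo -h -N -o "%N %f"` prints one "nodename feature,feature,..." pair per line on your Slurm version, and the helper name `nodes_with_feature` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a comma-separated list of the nodes whose
# feature list contains exactly the feature named in $1.
# Input (stdin): lines of the form "nodename feat1,feat2,...", which is the
# shape `sinfo -h -N -o "%N %f"` is assumed to produce.
nodes_with_feature() {
    awk -v want="$1" '
        {
            n = split($2, feats, ",")
            for (i = 1; i <= n; i++) {
                # sinfo -N can list a node once per partition, so skip repeats.
                if (feats[i] == want && !seen[$1]++) {
                    out = out (out == "" ? "" : ",") $1
                }
            }
        }
        END { print out }
    '
}

# Only query the cluster when the Slurm tools exist and a feature was given.
if command -v sinfo >/dev/null 2>&1 && [ -n "${1:-}" ]; then
    nodes=$(sinfo -h -N -o "%N %f" | nodes_with_feature "$1")
    if [ -n "$nodes" ]; then
        # Same query as the guru's suggestion, with the node list generated.
        squeue -a -t r -w "$nodes"
    fi
fi
```

Saved under a name such as `jobs_on.sh` (hypothetical), `bash jobs_on.sh foo` would list the running jobs on nodes that advertise feature foo. If the generated list gets long, `scontrol show hostlist "$nodes"` should compress it back into the nodea[001-100] form.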

So I was wondering if anyone out there knows of a cool/elegant/efficient way of doing this?


PS: I still might write a bash script that hard-codes which node names belong to each constraint, and accept that I might have to update it once every year or two. Now it's time to look at what parser SLURM uses for nodelists. Can you use regexes, *, etc.? Or just nodea[001-100]? Time to find out!

Matt Thompson, SSAI, Ld Scientific Programmer/Analyst
NASA GSFC,    Global Modeling and Assimilation Office
Code 610.1,  8800 Greenbelt Rd,  Greenbelt,  MD 20771
Phone: 301-614-6712                 Fax: 301-614-6246