[slurm-users] [EXTERNAL] Re: How to look for free nodes of a certain constraint efficiently
Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC]
matthew.thompson at nasa.gov
Fri Oct 15 14:53:45 UTC 2021
Ole and Carsten,
Thanks. I was able to use:
sinfo -O Nodes,Features:30,StateLong
along with some grep/awk/bash to piece something together where I can get a good estimate of idle/allocated nodes.
It's rough (not at the level of Slurm_tools, i.e., showpartitions), but it does enough. Next up: all that fun colorizing/options/etc.!
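[Editor's sketch, not the script from this message.] One way the grep/awk step might look, assuming the three-column output of the sinfo command above (node count, features, state). The function name count_by_state is hypothetical, and stripping trailing flag characters such as "@" or "*" from the state is an assumption about how one might want to group states:

```shell
#!/usr/bin/env bash
# Sum node counts per state for a single feature.
# Expects "sinfo -O Nodes,Features:30,StateLong" style lines on stdin:
#   <node count> <comma-separated features> <state>
count_by_state() {
    local feature="$1"
    awk -v want="$feature" '
        $1 !~ /^[0-9]+$/ { next }          # skip the header (or any non-data line)
        {
            n = split($2, feats, ",")      # match whole feature names only
            for (i = 1; i <= n; i++) {
                if (feats[i] == want) {
                    state = $3
                    sub(/[^a-z]+$/, "", state)   # assumption: drop flags like "@" or "*"
                    total[state] += $1
                    break
                }
            }
        }
        END { for (s in total) print s, total[s] }
    ' | sort
}

# Usage on a cluster (feature name is only an example):
#   sinfo -O Nodes,Features:30,StateLong | count_by_state xeon40
```

Note that a feature string longer than the chosen field width (30 here) may be cut off in the sinfo output, so the width may need to be raised on clusters with long feature lists.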
Thanks,
Matt
--
Matt Thompson, SSAI, Ld Scientific Programmer/Analyst
NASA GSFC, Global Modeling and Assimilation Office
Code 610.1, 8800 Greenbelt Rd, Greenbelt, MD 20771
Phone: 301-614-6712 Fax: 301-614-6246
http://science.gsfc.nasa.gov/sed/bio/matthew.thompson
On 10/14/21, 10:39 AM, "slurm-users on behalf of Ole Holm Nielsen" <slurm-users-bounces at lists.schedmd.com on behalf of Ole.H.Nielsen at fysik.dtu.dk> wrote:
Hi Matt,
How about this sinfo command:
$ sinfo -O NodeList:30,Features:30,StateLong
NODELIST                      AVAIL_FEATURES                STATE
i023                          xeon2650v2,infiniband,xeon16  draining@
i[004-022,024-050]            xeon2650v2,infiniband,xeon16  allocated
x[001-192]                    xeon2650v4,opa,xeon24         allocated
a[001-128]                    xeon6242r,opa,xeon40          allocated
b[001-012],c[001-196]         xeon6148v5,opa,xeon40         allocated
You can grep for the desired features and use the nodelist in column 1 for
further processing.
/Ole
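[Editor's sketch, not part of the original reply.] The "grep for the desired features and use the nodelist in column 1" step could be wired up roughly as follows. The function name nodes_with_feature is hypothetical; the squeue flags in the usage comment are the ones quoted in the original question below:

```shell
#!/usr/bin/env bash
# Print a comma-joined nodelist of all rows that carry a given feature.
# Expects "sinfo -O NodeList:30,Features:30,StateLong" style lines on stdin:
#   <nodelist> <comma-separated features> <state>
nodes_with_feature() {
    awk -v want="$1" '
        $1 == "NODELIST" { next }          # skip the header line
        {
            n = split($2, feats, ",")      # match whole feature names only
            for (i = 1; i <= n; i++) {
                if (feats[i] == want) {
                    if (!seen[$1]++)       # same nodelist can appear once per state
                        list = list (list == "" ? "" : ",") $1
                    break
                }
            }
        }
        END { if (list != "") print list }
    '
}

# Usage on a cluster (feature name is only an example):
#   squeue -a -t r -w "$(sinfo -O NodeList:30,Features:30,StateLong | nodes_with_feature xeon40)"
```

A long nodelist may not fit in a 30-character field, so a larger width than NodeList:30 is probably needed on bigger clusters.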
On 10/14/21 2:44 PM, Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND
APPLICATIONS INC] wrote:
> All,
>
> I work on a cluster that uses SLURM which has various types of nodes that
> are controlled via --constraint flags in sbatch.
>
> Now, I started thinking "How can I figure out how many jobs are
> running/pending/etc on a certain type of node?". I first thought obviously
> "squeue --constraint=foo", but...nope. No --constraint flag with squeue.
> Okay. Constraints are just Features by another name, but...you can't seem
> to just squeue a feature either.
>
> I asked a SLURM guru here and they suggested using --nodelist/-w a la:
>
> squeue -a -w nodea[001-100],nodeb[001-100],... -t r
>
> where you pass in all the nodes of a certain type. And, yep, that works!
> But that also means I have to know what nodes are what type. I could
> obviously do a one-time parsing of "scontrol show nodes" and see what each
> chunk is and be done with it...but dangit I'm lazy and SLURM has so many
> programs and options there might just be something and I haven't read the
> right manpage! :)
>
> So I was wondering if anyone out there knows of a cool/elegant/efficient
> way of doing this?
>
> Thanks,
>
> Matt
>
> PS: I still might write a bash script that lists the node names for each
> constraint, knowing I might have to update it once every year or two.
> Now time to look at what parser SLURM uses for nodelists. Can you use
> regexes, *, etc.? Or just nodea[001-100]? Time to find out!