[slurm-users] How to look for free nodes of a certain constraint efficiently

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Thu Oct 14 14:34:24 UTC 2021


Hi Matt,

How about this sinfo command:

$ sinfo -O NodeList:30,Features:30,StateLong
NODELIST                      AVAIL_FEATURES                STATE
i023                          xeon2650v2,infiniband,xeon16  draining@
i[004-022,024-050]            xeon2650v2,infiniband,xeon16  allocated
x[001-192]                    xeon2650v4,opa,xeon24         allocated
a[001-128]                    xeon6242r,opa,xeon40          allocated
b[001-012],c[001-196]         xeon6148v5,opa,xeon40         allocated

You can grep for the desired features and use the nodelist in column 1 for 
further processing.
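
As a rough sketch of that further processing (not a tested recipe), the
filtering could be done with awk instead of grep, so that a feature such as
xeon40 is matched exactly rather than as a substring. The function name
nodes_with_feature and its arguments are made up for illustration; it assumes
three whitespace-separated columns in the order shown above and that neither
the nodelist nor the feature list has been truncated by the field widths.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Slurm): reads sinfo output in the
# "NodeList Features StateLong" column layout on stdin and prints a
# comma-joined nodelist of all rows that carry the given feature and
# whose state starts with the given prefix (default: idle).
nodes_with_feature() {
    local feature="$1" state="${2:-idle}"
    awk -v f="$feature" -v s="$state" '
        NF >= 3 {
            # Column 2 is a comma-separated feature list; compare each
            # entry exactly so "xeon4" does not match "xeon40".
            n = split($2, feats, ",")
            for (i = 1; i <= n; i++) {
                if (feats[i] == f && index($3, s) == 1) {
                    out = (out == "" ? $1 : out "," $1)
                    break
                }
            }
        }
        END { if (out != "") print out }
    '
}
```

The result can then be handed to squeue with -w, for example:

   nodes=$(sinfo -h -O NodeList:30,Features:30,StateLong | nodes_with_feature xeon40 allocated)
   squeue -a -t r -w "$nodes"

If the nodelists at your site are longer than 30 characters, the field
widths would need to be increased.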

/Ole

On 10/14/21 2:44 PM, Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND 
APPLICATIONS INC] wrote:
> All,
> 
> I work on a cluster that uses SLURM, which has various types of nodes that 
> are controlled via --constraint flags in sbatch.
> 
> Now, I started thinking "How can I figure out how many jobs are 
> running/pending/etc on a certain type of node?". I first thought obviously 
> "squeue --constraint=foo", but...nope. No --constraint flag with squeue. 
> Okay. Constraints are just Features by another name, but...you can't seem 
> to just squeue a feature either.
> 
> I asked a SLURM guru here and they suggested using --nodelist/-w a la:
> 
>    squeue -a -w nodea[001-100],nodeb[001-100],... -t r
> 
> where you pass in all the nodes of a certain type. And, yep, that works! 
> But that also means I have to know what nodes are what type. I could 
> obviously do a one-time parsing of "scontrol show nodes" and see what each 
> chunk is and be done with it...but dangit I'm lazy and SLURM has so many 
> programs and options there might just be something and I haven't read the 
> right manpage! :)
> 
> So I was wondering if anyone out there knows of a cool/elegant/efficient 
> way of doing this?
> 
> Thanks,
> 
> Matt
> 
> PS: I still might write a bash script where I've listed the node names 
> for each constraint, and accept that I might have to update it once every 
> year or two. Now time to look at what parser SLURM uses for nodelist. Can 
> you use regexes and use *, etc? Or just use nodea[001-100]? Time to find out!


