[slurm-users] [EXTERNAL] Re: How to look for free nodes of a certain constraint efficiently

Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND APPLICATIONS INC] matthew.thompson at nasa.gov
Fri Oct 15 14:53:45 UTC 2021


Ole and Carsten,

Thanks. I was able to use:

  sinfo -O Nodes,Features:30,StateLong

along with some grep/awk/bash to piece something together where I can get a good estimate of idle/allocated nodes.
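
A minimal sketch of that kind of grep/awk post-processing (not the actual script, which isn't shown here). The function name count_nodes_by_state is hypothetical. It assumes three whitespace-separated columns as printed by the sinfo command above (node count, comma-separated features, long state), and that the feature list fits in its 30-character column so the columns don't run together:

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum node counts per state for one feature.
# Reads rows shaped like the output of
#   sinfo -O Nodes,Features:30,StateLong
# on stdin: <node count> <feature1,feature2,...> <state>
count_nodes_by_state() {
    local feature="$1"
    awk -v want="$feature" '
        # Only rows whose first field is a number; this skips the header line.
        NF >= 3 && $1 ~ /^[0-9]+$/ {
            n = split($2, feats, ",")
            for (i = 1; i <= n; i++) {
                if (feats[i] == want) {
                    state = $3
                    # Drop trailing flag characters, e.g. the "@" in "draining@".
                    sub(/[^a-z]+$/, "", state)
                    count[state] += $1
                    break
                }
            }
        }
        END {
            for (s in count) printf "%s %d\n", s, count[s]
        }
    ' | sort
}
```

On a real cluster it would be fed from sinfo, for example:

  sinfo -h -O Nodes,Features:30,StateLong | count_nodes_by_state xeon40

Matching whole comma-separated features (rather than a plain grep) avoids a feature name matching as a substring of a longer one.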

It's rough (not at the level of Slurm_tools, e.g., showpartitions) but it does enough. Next up: all that fun colorizing/options/etc.!

Thanks,
Matt
-- 
Matt Thompson, SSAI, Ld Scientific Programmer/Analyst
NASA GSFC,    Global Modeling and Assimilation Office
Code 610.1,  8800 Greenbelt Rd,  Greenbelt,  MD 20771
Phone: 301-614-6712                 Fax: 301-614-6246
http://science.gsfc.nasa.gov/sed/bio/matthew.thompson

On 10/14/21, 10:39 AM, "slurm-users on behalf of Ole Holm Nielsen" <slurm-users-bounces at lists.schedmd.com on behalf of Ole.H.Nielsen at fysik.dtu.dk> wrote:

    Hi Matt,

    How about this sinfo command:

    $ sinfo -O NodeList:30,Features:30,StateLong
    NODELIST                      AVAIL_FEATURES                STATE
    i023                          xeon2650v2,infiniband,xeon16  draining@
    i[004-022,024-050]            xeon2650v2,infiniband,xeon16  allocated
    x[001-192]                    xeon2650v4,opa,xeon24         allocated
    a[001-128]                    xeon6242r,opa,xeon40          allocated
    b[001-012],c[001-196]         xeon6148v5,opa,xeon40         allocated

    You can grep for the desired features and use the nodelist in column 1 for 
    further processing.

    /Ole

    On 10/14/21 2:44 PM, Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS AND 
    APPLICATIONS INC] wrote:
    > All,
    > 
    > I work on a cluster that uses SLURM which has various types of nodes that 
    > are controlled via --constraint flags in sbatch.
    > 
    > Now, I started thinking "How can I figure out how many jobs are 
    > running/pending/etc on a certain type of node?". I first thought obviously 
    > "squeue --constraint=foo", but...nope. No --constraint flag with squeue. 
    > Okay. Constraints are just Features by another name, but...you can't seem 
    > to just squeue a feature either.
    > 
    > I asked a SLURM guru here and they suggested using --nodelist/-w a la:
    > 
    >    squeue -a -w nodea[001-100],nodeb[001-100],... -t r
    > 
    > where you pass in all the nodes of a certain type. And, yep, that works! 
    > But that also means I have to know what nodes are what type. I could 
    > obviously do a one-time parsing of "scontrol show nodes" and see what each 
    > chunk is and be done with it...but dangit I'm lazy and SLURM has so many 
    > programs and options there might just be something and I haven't read the 
    > right manpage! :)
    > 
    > So I was wondering if anyone out there knows of a cool/elegant/efficient 
    > way of doing this?
    > 
    > Thanks,
    > 
    > Matt
    > 
    > PS: I still might write a bash script where I've listed the node names 
    > for each constraint, and accept that I might have to update it once every 
    > year or two. Now time to look at what parser SLURM uses for nodelist. Can 
    > you use regexes and use *, etc? Or just use nodea[001-100]? Time to find out!



