[slurm-users] Statistics on node utilization?

Paul Edmon pedmon at cfa.harvard.edu
Thu Oct 17 14:00:09 UTC 2019


We have been using:

https://github.com/fasrc/slurm-diamond-collector

for our setup, though it gives more of an overall picture.  We also use 
this:

https://github.com/fasrc/lsload

-Paul Edmon-

On 10/16/19 4:53 PM, Will Dennis wrote:
> Hi all,
>
> We run a few Slurm clusters here, all using SlurmDBD to store job history info. I also use Open XDMoD (http://open.xdmod.org/) to run statistics on the jobs. However, it seems that XDMoD does not provide node utilization statistics, unless my XDMoD just isn't configured to do that… What I'm looking for is the number of jobs landing on each node over a given period, plus things like the number of completed jobs, failed jobs, etc. per node. I'm trying to get a sense of how loaded (or, in my case, most probably how unused) the individual nodes in a cluster are.
>
> I have run the command:
> sacct -X -p -o jobid,jobname,start,end,user,partition%-30,nodelist,alloccpus,reqmem,cputime,qos,state,exitcode,AllocTRES%-50 -S 01/01/19 > sacct-parsable-2019.txt
> to get a list of jobs dumped out for the year, sucked it into Excel, and used a PivotTable to get some stats, but that is the long way of doing this… I would like something more dynamic and easier. Does anyone have any suggestions?
>
> Thanks,
> Will
>
>
>
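
The Excel PivotTable step described in the quoted message could also be scripted. The following is only a rough Python sketch, not something taken from either tool linked above. It assumes the pipe-delimited file produced by the sacct -p command in the quoted message, with a header row containing NodeList and State columns. The function names (split_top_level, expand_hostlist, jobs_per_node) are made up for illustration, and the hostlist expansion is deliberately simplified: it handles at most one bracket group per name, whereas "scontrol show hostnames" is the authoritative way to expand a Slurm node list.

```python
import csv
import io
import re
from collections import Counter, defaultdict


def split_top_level(hostlist):
    """Split a hostlist on commas that are not inside [...] brackets."""
    parts, depth, current = [], 0, []
    for ch in hostlist:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))
    return parts


def expand_hostlist(hostlist):
    """Expand e.g. 'node[01-03,07],gpu1' into individual node names.

    Simplified assumption: at most one bracket group per name.
    """
    hosts = []
    for part in split_top_level(hostlist):
        m = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", part)
        if not m:
            hosts.append(part)
            continue
        prefix, ranges, suffix = m.groups()
        for r in ranges.split(","):
            lo, _, hi = r.partition("-")
            hi = hi or lo
            width = len(lo)  # preserve zero padding such as 01, 02
            for i in range(int(lo), int(hi) + 1):
                hosts.append(f"{prefix}{i:0{width}d}{suffix}")
    return hosts


def jobs_per_node(text):
    """Return {node: Counter({state: job_count})} from 'sacct -p' output.

    A multi-node job is counted once on every node it was allocated.
    """
    counts = defaultdict(Counter)
    reader = csv.DictReader(io.StringIO(text), delimiter="|",
                            quoting=csv.QUOTE_NONE)
    for raw in reader:
        row = {k.strip(): (v or "").strip()
               for k, v in raw.items() if k is not None}
        nodelist = row.get("NodeList", "")
        if not nodelist or nodelist == "None assigned":
            continue  # job never landed on a node
        words = row.get("State", "").split()
        # 'CANCELLED by 1234' is reduced to 'CANCELLED'
        state = words[0] if words else "UNKNOWN"
        for node in expand_hostlist(nodelist):
            counts[node][state] += 1
    return counts


# Possible usage with the file name from the quoted message:
#   with open("sacct-parsable-2019.txt") as fh:
#       for node, states in sorted(jobs_per_node(fh.read()).items()):
#           print(node, dict(states))
```

Nodes that never ran a job will not appear in the result at all, so to spot completely idle nodes the output would still need to be compared against the full node list for the cluster.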


