[slurm-users] Tracking efficiency of all jobs on the cluster (dashboard etc.)

Tina Friedrich tina.friedrich at it.ox.ac.uk
Wed Jul 26 16:41:19 UTC 2023


Hi Will,

I don't, currently, although it's on my list.

However, we had a presentation at a recent Oxford HPC-SIG meeting from a 
colleague who implemented a simple job profiler that saves a lot of job 
data (including efficiency) and creates plots of the efficiency of the 
job run (in a nutshell). We all thought it sounded interesting :)

Code is here: https://github.com/OxfordCBRG/sps

(it's a SPANK plugin, I believe)
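
For a rough cluster-wide view without extra tooling, something along the 
lines below might be a starting point. It is only a sketch and is not taken 
from sps: it assumes pipe-separated accounting lines such as those printed by 
`sacct -a -X -n -P --format=JobID,User,Elapsed,TotalCPU,AllocCPUS`, and the 
function names and sample data are made up for illustration. CPU efficiency 
is taken as TotalCPU / (Elapsed * AllocCPUS).

```python
def slurm_time_to_seconds(text):
    """Convert a Slurm duration such as '1-02:03:04' or '01:23.456' to seconds."""
    text = text.strip()
    if not text:
        return 0.0  # treat a missing value as zero (assumption)
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    fields = [float(f) for f in text.split(":")]
    # Pad to [hours, minutes, seconds] when hours (or minutes) are omitted.
    while len(fields) < 3:
        fields.insert(0, 0.0)
    hours, minutes, seconds = fields
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(sacct_text):
    """Return {job_id: (user, efficiency)} from pipe-separated sacct lines.

    Assumed column order: JobID|User|Elapsed|TotalCPU|AllocCPUS.
    """
    result = {}
    for line in sacct_text.splitlines():
        if not line.strip():
            continue
        job_id, user, elapsed, total_cpu, alloc_cpus = line.strip().split("|")
        allocated = slurm_time_to_seconds(elapsed) * int(alloc_cpus)
        if allocated == 0:
            continue  # skip jobs with no allocated core-seconds
        result[job_id] = (user, slurm_time_to_seconds(total_cpu) / allocated)
    return result


# Hypothetical sample data, not real accounting output.
sample = "1001|alice|01:00:00|03:30:00|4\n1002|bob|1-00:00:00|12:00:00|8\n"
for job_id, (user, eff) in sorted(cpu_efficiency(sample).items(),
                                  key=lambda item: item[1][1]):
    print(f"{job_id} {user} {eff:.3f}")
```

Sorting by efficiency puts the worst offenders first. Bear in mind that 
TotalCPU may under-report for jobs whose processes are not tracked by Slurm, 
so treat low numbers as a prompt to look closer rather than as proof.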

Tina

On 24/07/2023 15:37, Will Furnell - STFC UKRI wrote:
> Hello,
> 
> I am aware of ‘seff’, which allows you to check the efficiency of a 
> single job, which is good for users, but as a cluster administrator I 
> would like to be able to track the efficiency of all jobs from all users 
> on the cluster, so I am able to ‘re-educate’ users that may be running 
> jobs that have terrible resource usage efficiency.
> 
> What do other cluster administrators use for this task? Is there 
> anything you use and recommend (or don’t recommend) or have heard of 
> that is able to do this? Even if it’s something like a Grafana dashboard 
> that hooks up to the SLURM database.
> 
> Thank you,
> 
> Will.
> 


