[slurm-users] CPU & memory usage summary for a job

Sam Hawarden sam.hawarden at otago.ac.nz
Sun Dec 9 16:14:44 MST 2018


Hi Aravindh

For our small 3-node cluster I've hacked together a per-node Python script that collects current and peak CPU, memory and scratch disk usage for all jobs running on the cluster and builds a fairly simple web page from it. It shouldn't be hard to make it store those data points over time, then shove them through an R script to plot the usage:

https://github.com/shawarden/simple-web
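
As a rough illustration of the "store those data points over time" idea, here is a minimal sketch. It is not taken from the script linked above. It assumes cgroup v1 accounting files (cpuacct.usage and memory.usage_in_bytes) and that you already know the job's cgroup directories, which on many Slurm installs look something like /sys/fs/cgroup/cpuacct/slurm/uid_<uid>/job_<jobid>, though that layout is an assumption and will differ per site. The function names, CSV columns and file name are made up for the example.

```python
import csv
import time
from pathlib import Path

# Column names for the CSV log (hypothetical, chosen for this example).
FIELDS = ["timestamp", "cpu_seconds", "mem_bytes"]


def read_int(path):
    """Return the integer stored in a single-value cgroup file."""
    return int(Path(path).read_text().strip())


def sample_job(cpu_dir, mem_dir):
    """Take one sample of a job's cumulative CPU time and current memory use.

    cpu_dir and mem_dir are the job's cgroup v1 directories for the
    cpuacct and memory controllers (the layout is an assumption).
    """
    return {
        "timestamp": time.time(),
        # cpuacct.usage holds cumulative CPU time in nanoseconds.
        "cpu_seconds": read_int(Path(cpu_dir) / "cpuacct.usage") / 1e9,
        # memory.usage_in_bytes holds the memory currently charged to the cgroup.
        "mem_bytes": read_int(Path(mem_dir) / "memory.usage_in_bytes"),
    }


def append_sample(csv_path, sample):
    """Append one sample to a CSV file, writing the header on first use."""
    csv_path = Path(csv_path)
    write_header = not csv_path.exists()
    with csv_path.open("a", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(sample)


def record(cpu_dir, mem_dir, csv_path, interval=1.0, count=60):
    """Sample the job `count` times, `interval` seconds apart."""
    for _ in range(count):
        append_sample(csv_path, sample_job(cpu_dir, mem_dir))
        time.sleep(interval)
```

Calling record(cpu_dir, mem_dir, "job_usage.csv") from a per-node daemon or a job prolog would give a per-second log that R (or anything else) can plot. On cgroup v2 the equivalent numbers live in cpu.stat and memory.current instead, so the two file reads would need adjusting.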

Cheers,
  Sam

________________________________
Sam Hawarden
Assistant Research Fellow
Pathology Department
Dunedin School of Medicine
________________________________
From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Aravindh Sampathkumar <aravindh at fastmail.com>
Sent: Monday, 10 December 2018 02:39
To: slurm-users at lists.schedmd.com
Subject: [slurm-users] CPU & memory usage summary for a job

Hi All.

I was wondering if anybody has thought of or hacked together a way to record the CPU and memory consumption of a job over its entire duration and give a summary of the usage pattern within that job?
Not the MaxRSS and CPU time that already get reported for every job.

I'm thinking more like a chart of CPU utilisation, memory usage, and disk usage on a per second basis or something like that.

Asking because some of my users have no clue about the resource consumption of their jobs, and just blindly ask for way more resources as a "safe" option. It would be a nice way for users to learn simple things, for example that they asked for 8 cores but their job ran on just 1 core the entire time because a library they used is limited to a single core.
We use cgroups for process accounting and for limiting each job's CPU and memory usage. We also use QoS for limiting resource reservations at the user level.
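
To make the 8-cores-versus-1-core example concrete, the kind of summary I have in mind could be computed roughly like this. It is only a sketch: it assumes periodic samples of (timestamp in seconds, cumulative CPU seconds, memory in bytes) are already being collected somehow, and the function name and output keys are made up for illustration.

```python
def summarise(samples, cores_requested):
    """Summarise a job from time-ordered (timestamp_s, cpu_seconds, mem_bytes) samples.

    cpu_seconds is cumulative CPU time, so the average number of busy
    cores is the CPU time consumed divided by the wall time elapsed.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    wall = samples[-1][0] - samples[0][0]
    if wall <= 0:
        raise ValueError("samples must span a positive time interval")
    cpu = samples[-1][1] - samples[0][1]
    avg_cores = cpu / wall
    return {
        "avg_cores_used": avg_cores,
        # Fraction of the requested cores that were actually kept busy.
        "cpu_efficiency": avg_cores / cores_requested,
        "peak_mem_bytes": max(s[2] for s in samples),
    }


# A job that requested 8 cores but only ever kept 1 busy.
samples = [(0.0, 0.0, 100), (10.0, 10.0, 300), (20.0, 20.0, 200)]
summary = summarise(samples, cores_requested=8)
```

Here avg_cores_used comes out as 1.0 and cpu_efficiency as 0.125, which is exactly the sort of number I'd like to put in front of users.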

--
  Aravindh Sampathkumar
  aravindh at fastmail.com

