[slurm-users] CPU & memory usage summary for a job
Carlos Fenoy
minibit at gmail.com
Mon Dec 10 10:54:16 MST 2018
You can also use the InfluxDB profiling plugin I developed, which is included in the latest Slurm version. It provides live CPU and memory usage per task, step, host and job. You can then build a Grafana dashboard on top of it to display the live metrics.
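
As a rough sketch of what enabling the plugin might look like (the host name, port, database name, retention policy and sampling interval below are placeholders I am assuming, not values from any real site, so please check the slurm.conf and acct_gather.conf man pages for your version):

```
# slurm.conf (excerpt)
# Select the InfluxDB profiling plugin.
AcctGatherProfileType=acct_gather_profile/influxdb
# Sample task-level usage every 10 seconds (assumed interval; tune to taste).
JobAcctGatherFrequency=task=10

# acct_gather.conf (excerpt)
# Placeholder InfluxDB endpoint and database; replace with your own.
ProfileInfluxDBHost=influxdb.example.com:8086
ProfileInfluxDBDatabase=slurm_profile
# Retention policy name; "autogen" is the InfluxDB 1.x default.
ProfileInfluxDBRTPolicy=autogen
# Collect task (CPU and memory) data by default for every job.
ProfileInfluxDBDefault=Task
```

If you would rather not profile every job, the default can be left at None and users can opt in per job with something like `sbatch --profile=task job.sh`.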
Regards,
Carlos
> On 9 Dec 2018, at 14:39, Aravindh Sampathkumar <aravindh at fastmail.com> wrote:
>
> Hi All.
>
> I was wondering if anybody has thought of or hacked around a way to record CPU and memory consumption of a job during its entire duration and give a summary of the usage pattern within that job?
> I don't mean the MaxRSS and CPU time that already get reported for every job.
>
> I'm thinking more of a chart of CPU utilisation, memory usage, and disk usage on a per-second basis, or something like that.
>
> I ask because some of my users have no idea how many resources their jobs actually consume, and simply request far more than they need as a "safe" option. It would be a nice way for users to learn simple things, for example that they asked for 8 cores but their job ran on just 1 core the entire time because a library they used is limited to a single core.
> We use cgroups for process accounting and for limiting each job's CPU and memory usage. We also use QoS to limit resource reservations at the user level.
>
> --
> Aravindh Sampathkumar
> aravindh at fastmail.com