[slurm-users] derived counters
heckes at mps.mpg.de
Tue Apr 13 12:04:58 UTC 2021
> >>> * (average) queue length for a certain partition
> I wonder what exactly does your question mean? Maybe the number of jobs or
> CPUs in the Pending state? Maybe relative to the number of CPUs in the
This results from a management question: how long do jobs have to wait (in s, min, h, days) before they get executed, and
how many jobs are waiting (are queued) for each partition in a certain time interval?
The first one is easy to find with sacct: take the submit and start times, compute the difference, and average.
The second is a bit cumbersome, so I wonder whether a 'solution' is already around. The easiest way is to monitor from the beginning and store the squeue output for later evaluation. Unfortunately I didn’t do that.
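Both numbers could perhaps be reconstructed after the fact from accounting data. The following is only a rough sketch, not a tested solution. It assumes pipe-separated input such as that produced by something like "sacct -a -X -n -P -S <start> -E <end> -o JobID,Partition,Submit,Start", with ISO timestamps and a non-timestamp value (e.g. "Unknown") in the Start column for jobs that have not started. The function names are made up for illustration. A job is treated as queued at time t if submit <= t < start, which ignores held jobs, dependencies, jobs cancelled while pending, and jobs submitted to several partitions at once.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"  # assumed sacct timestamp format


def parse_jobs(text):
    """Yield (partition, submit, start) from 'JobID|Partition|Submit|Start' lines."""
    for line in text.splitlines():
        fields = line.strip().split("|")
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        _jobid, partition, submit, start = fields
        try:
            t_submit = datetime.strptime(submit, FMT)
        except ValueError:
            continue  # no usable submit time
        try:
            t_start = datetime.strptime(start, FMT)
        except ValueError:
            t_start = None  # assumption: not a timestamp means "not started"
        yield partition, t_submit, t_start


def mean_wait_seconds(jobs):
    """Average (start - submit) per partition, over jobs that did start."""
    waits = defaultdict(list)
    for partition, submit, start in jobs:
        if start is not None:
            waits[partition].append((start - submit).total_seconds())
    return {p: sum(w) / len(w) for p, w in waits.items()}


def queue_length(jobs, when):
    """Number of jobs per partition that were submitted but not started at 'when'."""
    counts = defaultdict(int)
    for partition, submit, start in jobs:
        if submit <= when and (start is None or when < start):
            counts[partition] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = (
        "1|short|2021-04-13T10:00:00|2021-04-13T10:10:00\n"
        "2|short|2021-04-13T10:05:00|2021-04-13T10:25:00\n"
        "3|long|2021-04-13T10:00:00|Unknown\n"
    )
    jobs = list(parse_jobs(sample))
    print(mean_wait_seconds(jobs))
    print(queue_length(jobs, datetime(2021, 4, 13, 10, 7)))
```

Evaluating queue_length at evenly spaced timestamps across the interval and averaging the counts would give an approximate average queue length per partition.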
> The "slurmacct" command prints (possibly for a specified partition) the
> average job waiting time while Pending in the queue, but not the queue length
> It may be difficult to answer your question from the Slurm database. The sacct
> command displays accounting data for all jobs and job steps, but not directly
> for partitions.
> There are other Slurm monitoring tools which perhaps can supply the data you
> are looking for. You could ask this list again.