[slurm-users] derived counters
Heckes, Frank
heckes at mps.mpg.de
Tue Apr 13 12:04:58 UTC 2021
Hello Ole,
> >>> * (average) queue length for a certain partition
>
> I wonder what exactly does your question mean? Maybe the number of jobs or
> CPUs in the Pending state? Maybe relative to the number of CPUs in the
> partition?
>
This results from a management question: how long do jobs have to wait (in s, min, h, days) before they get executed,
and how many jobs are waiting (queued) in each partition during a certain time interval?
The first one is easy to obtain with sacct: take the Submit and Start timestamps, compute the difference, and average.
The second is a bit cumbersome, so I wonder whether a 'solution' is already around. The easiest way is to monitor from the beginning and store the squeue output for later evaluation. Unfortunately I didn’t do that.
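For illustration only, here is a minimal Python sketch of how both numbers might be reconstructed after the fact from accounting data instead of stored squeue output. It assumes the input was produced by something like `sacct -a -X -n -P -S <start> -E <end> -o JobID,Partition,Submit,Start`, that timestamps use the YYYY-MM-DDTHH:MM:SS layout, and that a job counts as queued from Submit until Start. The function names and the sample records are made up for this example.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S"  # assumed timestamp layout of the sacct output

# Made-up sample records: JobID|Partition|Submit|Start
SAMPLE = """\
1|batch|2021-04-01T00:00:00|2021-04-01T01:00:00
2|batch|2021-04-01T00:30:00|2021-04-01T02:00:00
3|gpu|2021-04-01T00:00:00|Unknown
"""


def parse_sacct(text):
    """Return (partition, submit, start) tuples; start is None if unparsable."""
    jobs = []
    for line in text.strip().splitlines():
        _jobid, partition, submit, start = line.split("|")
        try:
            start_dt = datetime.strptime(start, FMT)
        except ValueError:  # e.g. "Unknown" for a job that has not started
            start_dt = None
        jobs.append((partition, datetime.strptime(submit, FMT), start_dt))
    return jobs


def mean_wait_seconds(jobs):
    """Average Start - Submit per partition, over jobs that did start."""
    waits = defaultdict(list)
    for partition, submit, start in jobs:
        if start is not None:
            waits[partition].append((start - submit).total_seconds())
    return {p: sum(w) / len(w) for p, w in waits.items()}


def queue_length_at(jobs, partition, t):
    """Number of jobs of a partition that were submitted but not started at time t."""
    return sum(
        1
        for p, submit, start in jobs
        if p == partition and submit <= t and (start is None or start > t)
    )


def mean_queue_length(jobs, partition, t0, t1):
    """Time-weighted average queue length over the interval [t0, t1]."""
    pending_seconds = 0.0
    for p, submit, start in jobs:
        if p != partition:
            continue
        lo = max(submit, t0)
        # A job without a start time is treated as pending until t1.
        hi = min(start, t1) if start is not None else t1
        if hi > lo:
            pending_seconds += (hi - lo).total_seconds()
    return pending_seconds / (t1 - t0).total_seconds()
```

The average queue length is computed as the total pending time inside the interval divided by the interval length, which avoids having to sample the queue at fixed steps. Jobs that left the queue without ever starting (for example, cancelled while pending) are not handled here and would need extra care.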
Cheers,
-Frank
> The "slurmacct" command prints (possibly for a specified partition) the
> average job waiting time while Pending in the queue, but not the queue length
> information.
>
> It may be difficult to answer your question from the Slurm database. The sacct
> command displays accounting data for all jobs and job steps, but not directly
> for partitions.
>
> There are other Slurm monitoring tools which perhaps can supply the data you
> are looking for. You could ask this list again.
>
> /Ole