[slurm-users] sacct end time for failed jobs

Brian Andrus toomuchit at gmail.com
Tue Mar 5 18:07:30 UTC 2019


Hmm. I see this issue as well: several jobs in the db have no end time,
even though they are not running.
Not sure how that happened, but I do want to find a good way to clean it
up. Without an end time, sacct reports the jobs as if they were still
running, and the total elapsed time keeps growing.

Does anyone have a process they use to handle empty (aka "Unknown") end
times for jobs that are not running?
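
To make the question concrete, here is a minimal sketch of how such jobs
could be listed. It assumes pipe-delimited output from something like
`sacct -a -X -P -S <start> -o jobid,state,start,end,elapsed`. The function
name `find_stale_jobs` and the set of states treated as "still active" are
illustrative assumptions, not part of Slurm. The sketch only reports job
IDs; it does not modify the database.

```python
# Sketch: flag jobs whose End is "Unknown" although their state is not active.
# Input is the text of `sacct -P ...` output (header line, then one row per job).

# Assumption: states in which a missing end time is expected.
ACTIVE_STATES = {"RUNNING", "PENDING", "SUSPENDED", "COMPLETING",
                 "REQUEUED", "RESIZING"}


def find_stale_jobs(sacct_text):
    """Return JobIDs that have End == "Unknown" but a non-active state."""
    lines = [ln for ln in sacct_text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split("|")
    stale = []
    for ln in lines[1:]:
        row = dict(zip(header, ln.split("|")))
        # State can carry a suffix such as "CANCELLED by 1000";
        # keep only the first word.
        words = row.get("State", "").split()
        state = words[0] if words else ""
        if row.get("End") == "Unknown" and state not in ACTIVE_STATES:
            stale.append(row.get("JobID", ""))
    return stale
```

In practice the text could come from
`subprocess.run([...], capture_output=True, text=True).stdout`, and the
returned IDs would then be reviewed by hand before any cleanup.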

Brian Andrus

On Wed, Feb 27, 2019 at 10:43 PM Chris Samuel <chris at csamuel.org> wrote:

> On Tuesday, 26 February 2019 10:03:34 AM PST Brian Andrus wrote:
>
> > One thing I have noticed is that the END field for jobs with a state of
> > FAILED is "Unknown" but the ELAPSED field has the time it ran.
>
> That shouldn't happen, it works fine here (and where I've used Slurm in
> Australia).
>
> $ sacct -j ${FAILED_JOBID} -o start,end,elapsed,state
>               Start                 End    Elapsed      State
> ------------------- ------------------- ---------- ----------
> 2019-02-27T22:35:23 2019-02-27T22:36:20   00:00:57     FAILED
> 2019-02-27T22:35:23 2019-02-27T22:36:20   00:00:57     FAILED
> 2019-02-27T22:35:23 2019-02-27T22:36:38   00:01:15  COMPLETED
>
> The "COMPLETED" part is the extern step we have as we use pam_slurm_adopt.
>
> All the best,
> Chris
> --
>   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA