[slurm-users] Observation on SchedMD issue 6787: Add EndTime, CompletingTime to output of `scontrol completing`

Kevin Buckley Kevin.Buckley at pawsey.org.au
Mon Oct 28 09:53:00 UTC 2019


Hi there,

In SchedMD issue 6787 (https://bugs.schedmd.com/show_bug.cgi?id=6787),
Doug Jacobsen supplied a patch that alters the output of
`scontrol completing` to look like the following when run from the
command line (I have cut-and-pasted Chris Samuel's example from the
issue ticket):

> scontrol completing
JobId=1137366 EndTime=2019-04-08T16:45:18 CompletingTime=00:00:02 Nodes(COMPLETING)=nid00028
JobId=1137367 EndTime=2019-04-08T16:45:18 CompletingTime=00:00:02 Nodes(COMPLETING)=nid00028
...
>

and that patch went into 19.05.0pre4 and above.

Having just patched an 18.08.6-2 scontrol, I have, however,
noticed a couple of things that can make parsing the
"default" output slightly less useful in some cases, but
more useful in others. To wit, here's a job, stuck in a
CG state overnight, before I had "fully developed" what I
was doing with the output, as run from the command line:


JobId=7909805 EndTime=Ystday 23:07 CompletingTime=19:13:40 Nodes(COMPLETING) = nid00023

JobId=7909805 EndTime=23 Oct 23:07 CompletingTime=1-09:36:30 Nodes(COMPLETING) = nid00023


And here's the output from interrogating a different "stuck" job
(on a different machine, for those wondering about the smaller JobID
in the later job), first as run from the command line and then as run
from inside a cron job:


JobId=4167786 EndTime=13:42:42 CompletingTime=02:41:52 Nodes(COMPLETING)=nid00112

JobId=4167786 EndTime=2019-10-28T13:42:42 CompletingTime=03:09:21 Nodes(COMPLETING)=nid00112


The thing to take away here is that if SLURM_TIME_FORMAT is set to
"relative", then you could get three slightly different EndTime
formats, viz.:

EndTime=23:07:01
EndTime=Ystday 23:07
EndTime=23 Oct 23:07

to parse, so unsetting SLURM_TIME_FORMAT might be a good thing to do,
before looking to automate things around the output.
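
For anyone automating around this, here is a minimal Python sketch of
that advice, not a definitive implementation. It assumes that, with
SLURM_TIME_FORMAT removed from the environment, EndTime comes back in
the YYYY-MM-DDTHH:MM:SS form seen in the cron output above. The names
clean_env, parse_line and completing_jobs are made up for illustration,
and the regular expression tolerates both the "Nodes(COMPLETING)=" and
"Nodes(COMPLETING) = " spellings shown in this message.

```python
import os
import re
import subprocess
from datetime import datetime

# One line of patched `scontrol completing` output, ISO-style EndTime only.
LINE_RE = re.compile(
    r"JobId=(?P<jobid>\d+)\s+"
    r"EndTime=(?P<end>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\s+"
    r"CompletingTime=(?P<completing>\S+)\s+"
    r"Nodes\(COMPLETING\)\s*=\s*(?P<nodes>\S+)"
)


def clean_env(environ=None):
    """Return a copy of the environment without SLURM_TIME_FORMAT."""
    env = dict(os.environ if environ is None else environ)
    env.pop("SLURM_TIME_FORMAT", None)
    return env


def parse_line(line):
    """Parse one output line; return None if it is not in the expected form."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "jobid": int(m.group("jobid")),
        "end": datetime.strptime(m.group("end"), "%Y-%m-%dT%H:%M:%S"),
        "completing": m.group("completing"),
        "nodes": m.group("nodes"),
    }


def completing_jobs():
    """Run `scontrol completing` with SLURM_TIME_FORMAT unset and parse it."""
    result = subprocess.run(
        ["scontrol", "completing"],
        env=clean_env(),
        capture_output=True,
        text=True,
        check=True,
    )
    parsed = (parse_line(line) for line in result.stdout.splitlines())
    return [rec for rec in parsed if rec is not None]
```

Lines carrying a "relative" EndTime, such as "Ystday 23:07", simply fail
to match and are dropped, so it may be worth logging those rather than
ignoring them silently.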

Then again, sorting out jobs stuck in a CG state, before they get to be
a day old, would probably be a good thing too!
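
On that note, the CompletingTime field in the samples above looks the
same whatever the EndTime format, so it can be used on its own to spot
long-stuck jobs. The following is a rough Python sketch, assuming the
field is always [D-]HH:MM:SS as in the lines shown here. The function
names and the one-hour default threshold are arbitrary choices for
illustration, not anything Slurm defines.

```python
import re
from datetime import timedelta

# CompletingTime as seen above: optional "D-" prefix, then HH:MM:SS.
ELAPSED_RE = re.compile(r"CompletingTime=(?:(\d+)-)?(\d+):(\d{2}):(\d{2})")


def completing_time(line):
    """Return the CompletingTime of one output line as a timedelta, or None."""
    m = ELAPSED_RE.search(line)
    if m is None:
        return None
    days, hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def stuck_lines(lines, threshold=timedelta(hours=1)):
    """Keep only lines whose CompletingTime is at or beyond the threshold."""
    kept = []
    for line in lines:
        elapsed = completing_time(line)
        if elapsed is not None and elapsed >= threshold:
            kept.append(line)
    return kept
```

Feeding it the output lines of `scontrol completing` from a cron job
would then leave just the jobs worth chasing.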

Hoping that helps someone,
Kevin

-- 
Supercomputing Systems Administrator
Pawsey Supercomputing Centre


