[slurm-users] 17.11+auks+cgroups: finished jobs hang in completing state
R.Eggermont at tudelft.nl
Mon Mar 26 01:36:34 MDT 2018
On 26-03-18 05:04, Christopher Samuel wrote:
> Does the slurmd log report it trying to kill the auks process?
The first thing I need to do is turn up the logging verbosity.
> The fact that auks is hanging around makes me wonder if this is a
> different issue, but you never know..
It's not a 100% match but it's the closest I've found so far. I'll need
to study this some more.
I left a test job hanging last night, and this morning the slurmstepd
was gone, but the auks process was still there (orphaned)...
That differs from last night, when the nodes were drained because
of a batch job failure...
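
For anyone who wants to check their own nodes for the same symptom, here is
a rough sketch of how leftover processes could be listed. It assumes that an
orphan has been reparented to PID 1 (on systems with a subreaper the parent
may be a different process) and that the command line contains the string
"auks"; the function name find_orphans is made up for this example.

```python
import subprocess


def find_orphans(ps_output, name="auks"):
    """Return (pid, command) pairs for processes whose parent is PID 1
    and whose command line contains `name`.

    `ps_output` is the text printed by `ps -eo pid,ppid,args`.
    """
    orphans = []
    # Skip the header line that ps prints first.
    for line in ps_output.strip().splitlines()[1:]:
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, ppid, cmd = fields
        # Assumption: an orphaned process has been adopted by PID 1.
        if ppid == "1" and name in cmd:
            orphans.append((int(pid), cmd))
    return orphans


if __name__ == "__main__":
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid, cmd in find_orphans(out):
        print(pid, cmd)
```

Run on a compute node after a job has ended, this prints one line per
matching process. An empty result only means nothing matched the
assumptions above, not that the node is clean.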
I'll report back when I find out more.
Intelligent Systems Support & Data Steward | TU Delft
+31 15 27 83234 | Building 28, Floor 5, Room W660
Available Mon, Wed-Fri