[slurm-users] Show detailed information from a finished job
mercan
ahmet.mercan at uhem.itu.edu.tr
Thu Apr 23 09:31:16 UTC 2020
Hi;
I prefer to use an epilog script that stores the job information under
a top-level directory owned by the slurm user. To avoid one directory
with a huge number of files, it creates a sub-directory for every
thousand job files: for a job whose job ID is 230988, it creates a
directory named 230XXX. The SLURM_JOB_ID of a job array is also a
problem, because Slurm uses an ugly format (298903_[3%1]). For these
reasons my script is a little complex, but it works (I cropped the
unrelated parts):
#!/bin/bash
# Slurm epilog: save the "scontrol show job" output and the batch
# script of the finishing job.
# Array tasks are addressed as <array_job_id>_<task_id>.
if [ -n "$SLURM_ARRAY_JOB_ID" ]
then
    JOBNO="${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}"
else
    JOBNO="${SLURM_JOB_ID}"
fi
JI=${JOBNO%%_*}        # job ID without the array task suffix
JWIDE=${#JI}           # number of digits in the job ID
JIDLEN=$((JWIDE-3))    # digits left after dropping the last three
JDIR=/okyanus/SLURM/log/jobs/${JI:0:$JIDLEN}XXX
mkdir -p "$JDIR"       # one sub-directory per thousand jobs
SEP="==========================================================================="
echo "$SEP" >>"$JDIR/${JI}.txt"
scontrol show job -dd "$JOBNO" &>>"$JDIR/${JI}.txt" &&
    echo "$SEP" >>"$JDIR/${JI}.txt" &&
    scontrol write batch_script "$SLURM_JOBID" - >>"$JDIR/${JI}.txt"
exit 0
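
The directory naming can be tried without a running cluster. The
following is only a minimal sketch and not part of the script above:
the function name job_bucket and the LOG_ROOT variable are made up for
illustration, and mapping job IDs below 1000 to 0XXX is an assumption,
since the script does not say how it treats them.

```shell
#!/bin/bash
# Hypothetical helper: map a job ID (plain or array-style) to the
# per-thousand bucket directory described in the message.
# LOG_ROOT is an assumed, overridable root; the script uses its own path.
LOG_ROOT="${LOG_ROOT:-/tmp/slurm-joblog}"

job_bucket() {
    local jobno="$1"
    local base="${jobno%%_*}"   # drop the array task suffix, if any
    local len=${#base}          # number of digits in the job ID
    local prefix="0"            # assumption: IDs below 1000 go to 0XXX
    if (( len > 3 )); then
        prefix="${base:0:len-3}"   # all digits except the last three
    fi
    printf '%s/%sXXX\n' "$LOG_ROOT" "$prefix"
}

# With the default LOG_ROOT:
job_bucket 230988      # -> /tmp/slurm-joblog/230XXX
job_bucket 298903_3    # -> /tmp/slurm-joblog/298XXX
job_bucket 42          # -> /tmp/slurm-joblog/0XXX
```

With the layout used by the epilog script, the record of finished job
230988 would then be in /okyanus/SLURM/log/jobs/230XXX/230988.txt.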
Regards;
Ahmet M.
On 23.04.2020 at 10:33, Gestió Servidors wrote:
>
> Hello,
>
> When a job is “pending” or “running”, with “scontrol show
> jobid=#jobnumber” I can get some useful information, but when the job
> has finished, that command doesn’t return anything. For example, if I
> run “sacct” and I see that some jobs have finished with state
> “FAILED”, how can I get detailed information about those jobs?
>
> Thanks.
>