[slurm-users] new user simple question re sacct output line2
Paddy Doyle
paddy at tchpc.tcd.ie
Wed Nov 14 07:51:00 MST 2018
Hi Matt,
I think you're asking about the difference between Job, Step, and Task.
There's an overview of the job launch system here:
https://slurm.schedmd.com/job_launch.html
...but this Stack Overflow post actually summarises it nicely:
https://stackoverflow.com/a/46532581
A job consists of one or more steps, each consisting of one or more tasks,
each using one or more CPUs.
Jobs are typically created with the sbatch command, steps are created with
the srun command, tasks are requested (at the job level or the step level)
with --ntasks, and CPUs are requested per task with --cpus-per-task. Note
that jobs submitted with sbatch have one implicit step: the Bash script
itself.
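As a rough illustration of that structure (my own sketch, not taken from the
Stack Overflow post; the file name steps_demo.sh, the job name and the
hostname commands are just placeholders), a batch script with two explicit
steps could be written out like this:

```shell
# Write a small batch script to disk (steps_demo.sh is a placeholder name).
cat > steps_demo.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=steps_demo
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1

# Everything in this script runs as the implicit batch step.
srun --ntasks=4 hostname   # first explicit step: 4 tasks
srun --ntasks=1 hostname   # second explicit step: 1 task
EOF

# Submit it on a machine where Slurm is installed:
#   sbatch steps_demo.sh
```

With accounting set up, I would expect sacct to show one record for the job,
one for the batch script, and one per srun line.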
So I think what you're asking about is the job-level record. It describes
the job allocation as a whole, so it shows overall stats such as the total
time elapsed. But it may indeed have different values from some of the
steps (e.g. if you ask for 40 CPUs but then do an 'srun -n 10 something' as
one of the steps, then the AllocCPUS field for that step would be different).
A lot of the fields may well be blank if you didn't specify certain
parameters in your sbatch file, if accounting is not enabled, or if
there is no value for that job step.
So in your example below, '82' is the job itself, and '82.batch' is the
implicit step for the bash script.
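To make the JobID convention concrete, here is a small sketch that labels
sacct rows by the shape of their JobID. It assumes pipe-delimited input as
produced by 'sacct --parsable2 --noheader' with JobID as the first field.
The function name label_sacct_rows and the sample rows (including the job
name myjob and the 82.0 step) are made up for illustration and are not
output from your cluster.

```shell
# Label each sacct record by the form of its JobID (first field).
label_sacct_rows() {
    awk -F'|' '
        $1 ~ /\.batch$/  { print $1 ": batch step (the batch script itself)"; next }
        $1 ~ /\.extern$/ { print $1 ": extern step"; next }
        $1 ~ /\.[0-9]+$/ { print $1 ": numbered step (typically started with srun)"; next }
        $1 ~ /\./        { print $1 ": other step"; next }
                         { print $1 ": job allocation as a whole" }
    '
}

# Hypothetical sample rows in the form JobID|JobName|AllocCPUS.
printf '%s\n' '82|myjob|1' '82.batch|batch|1' '82.0|hostname|1' | label_sacct_rows

# On a real cluster the input would come from something like:
#   sacct -j 82 --parsable2 --noheader --format=JobID,JobName,AllocCPUS | label_sacct_rows
```

For the sample rows this labels 82 as the job allocation, 82.batch as the
batch step and 82.0 as a numbered step.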
Hope that helps.
Paddy
On Wed, Nov 14, 2018 at 01:38:54PM +0000, Matthew Goulden wrote:
> Hi,
>
>
> New to Slurm; currently working up to moving our system from UGE/SGE
>
>
> sacct output, including the default headers, is three lines. What is line 2 documenting? Most fields are blank.
>
>
> For most fields with values these are the same as for line 3:
>
> AllocCPUS,
> Elapsed,
> State,
> ExitCode,
> ReqMem,
>
> For some fields with values these are clearly related to that in line 3 (represented here as line1:line2:line3)
> JobID : 82 : 82.batch
> JobIDRaw : 82 : 82.batch
>
>
> For others the values are unique to line 2:
>
> JobName : <jobName assigned to -J in sbatch script> : batch
>
> Partition : all_slt_limit :
>
> ReqCPUFreqMin : Unknown : 0
> ReqCPUFreqMax : Unknown : 0
> ReqCPUFreqGov : Unknown : 0
>
> ReqTRES : billing=1,cpu=1,node=1 :
> AllocTRES : billing=1,cpu=1,mem=125000M,node=1 : cpu=1,mem=125000M,node=1
>
>
> I'm sure the documentation - which is excellent - details this but I've not found where; can someone give me the pointer I need?
>
>
> Many thanks
>
>
> Matt
--
Paddy Doyle
Trinity Centre for High Performance Computing,
Lloyd Building, Trinity College Dublin, Dublin 2, Ireland.
Phone: +353-1-896-3725
http://www.tchpc.tcd.ie/