[slurm-users] howto list/get all scripts run by a job?
mercan
ahmet.mercan at uhem.itu.edu.tr
Fri Jun 19 09:52:48 UTC 2020
But don't forget: if there isn't a script, you cannot retrieve one —
for example, salloc jobs have no batch script.
Ahmet M.
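As a minimal sketch, the `scontrol write batch_script` command discussed in this thread can be wrapped in a small helper; the job id 1234 is just a placeholder, and the fallback message covers the salloc case mentioned above (and hosts where scontrol is not installed):

```shell
#!/bin/bash
# get_job_script JOBID  -- print the batch script of a running job to
# stdout, or a fallback message when the job has no script (salloc/srun
# jobs) or scontrol is unavailable.
get_job_script() {
    local jobid=$1
    # The trailing "-" makes scontrol write the script to stdout
    # instead of to a file named slurm-<jobid>.sh.
    scontrol write batch_script "$jobid" - 2>/dev/null \
        || echo "no batch script for job $jobid (e.g. an salloc job, or scontrol not found)"
}

get_job_script 1234
```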
On 19.06.2020 12:39, Adrian Sevcenco wrote:
> On 6/19/20 12:35 PM, mercan wrote:
>> Hi;
>>
>> For running jobs, you can get the running script with using:
>>
>> scontrol write batch_script "$SLURM_JOBID" -
> wow, thanks a lot!!!
>
> Adrian
>
>>
>> command. The - parameter is required to write the script to the
>> screen (stdout) instead of to a file.
>>
>> Ahmet M.
>>
>>
>> On 19.06.2020 12:25, Adrian Sevcenco wrote:
>>> On 6/18/20 9:35 AM, Loris Bennett wrote:
>>>> Hi Adrian,
>>> Hi
>>>
>>>> Adrian Sevcenco <Adrian.Sevcenco at spacescience.ro> writes:
>>>>
>>>>> Hi! I'm trying to retrieve the actual executable of jobs, but I
>>>>> could not find out how to do it. I would like to do this for both
>>>>> cases: when the job is started with sbatch and when it is started
>>>>> with srun.
>>>>
>>>> For running jobs:
>>>>
>>>> scontrol show job <job id>
>>> well, this was the first thing I tried, but the Command field is (null):
>>>
>>> [root at alien ~]# scontrol show job 2794270
>>> JobId=2794270 JobName=AliEn.4865.575
>>> UserId=aliprod(1000) GroupId=aliprod(1000) MCS_label=N/A
>>> Priority=13338 Nice=0 Account=aliprod QOS=normal WCKey=*
>>> JobState=RUNNING Reason=None Dependency=(null)
>>> Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
>>> RunTime=09:20:37 TimeLimit=1-00:00:00 TimeMin=N/A
>>> SubmitTime=2020-06-19T02:22:45 EligibleTime=2020-06-19T02:22:45
>>> AccrueTime=2020-06-19T02:22:45
>>> StartTime=2020-06-19T02:32:10 EndTime=2020-06-20T02:32:10
>>> Deadline=N/A
>>> SuspendTime=None SecsPreSuspend=0 LastSchedEval=2020-06-19T02:32:10
>>> Partition=alien AllocNode:Sid=alien.spacescience.ro:4865
>>> ReqNodeList=(null) ExcNodeList=(null)
>>> NodeList=alien-0-62
>>> BatchHost=alien-0-62
>>> NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
>>> TRES=cpu=1,mem=2600M,node=1,billing=1
>>> Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
>>> MinCPUsNode=1 MinMemoryCPU=2600M MinTmpDiskNode=0
>>> Features=(null) DelayBoot=00:00:00
>>> OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
>>> Command=(null)
>>> WorkDir=/tmp
>>> StdErr=/dev/null
>>> StdIn=/dev/null
>>> StdOut=/dev/null
>>> Power=
>>>
>>> this is Slurm 19.05.2, and my purpose is only running jobs
>>>
>>> moreover, it's not clear to me what the steps of job submission are
>>> and which processes are involved ..
>>>
>>> it seems that slurmstepd (identified as jobid.batch) starts a
>>> slurm_script process, which I think is the actually submitted script
>>>
>>> the job then starts an srun with a script (specified in the submitted
>>> script), and at that moment I get a slurmstepd[jobid.0] where this is run
>>>
>>> so, for now it would be enough if, given a job id, I could get
>>> the submission script...
>>>
>>> is there a way to do it? (besides getting the node from squeue, then
>>> ssh-ing to the node and grepping ps?)
>>>
>>> Thank you!
>>> Adrian
>>>
>>>
>>>>
>>>> For completed jobs the information about the executable is not kept by
>>>> the standard accounting mechanism. However, it is possible to extract
>>>> more information yourself from either the prolog or epilog and save
>>>> this
>>>> somewhere.
>>>>
>>>> Cheers,
>>>>
>>>> Loris
>>>>
>>>
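Loris's prolog/epilog suggestion above can be sketched roughly as follows. This is an illustrative snippet, not a drop-in Slurm configuration: the archive directory and file naming are assumptions, and `archive_job_script` is a hypothetical helper. In a real Prolog, SLURM_JOB_ID is set by slurmd; "demo" is only a placeholder for running the sketch by hand:

```shell
#!/bin/bash
# Archive each job's batch script at job start, so it can still be
# inspected after the job completes (standard accounting does not keep it).
archive_job_script() {
    local jobid=$1 dir=$2
    mkdir -p "$dir"
    # Fails harmlessly for jobs without a batch script (salloc) or on
    # hosts where scontrol is not on PATH.
    scontrol write batch_script "$jobid" "$dir/job_${jobid}.sh" 2>/dev/null \
        || echo "could not archive script for job $jobid"
}

archive_job_script "${SLURM_JOB_ID:-demo}" "${ARCHIVE_DIR:-/tmp/job_scripts}"
```

The same helper could be called from an Epilog instead; doing it in the Prolog guarantees the script is captured even if the job is killed.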
>