[slurm-users] Dependencies with singleton and after

Jarno van der Kolk jvanderk at uottawa.ca
Thu Aug 29 12:39:36 UTC 2019


Hi Michael,

Yes, without the singleton it works as expected:
$ sbatch --hold fakejob.sh
Submitted batch job 26636869
$ sbatch --hold fakejob.sh
Submitted batch job 26636870
$ sbatch --hold fakejob.sh
Submitted batch job 26636871
$ scontrol update jobid=26636870 Dependency=after:26636871
$ scontrol update jobid=26636871 Dependency=after:26636869
$ scontrol release 26636869 26636870 26636871
$ squeue -u jarno
          JOBID     USER      ACCOUNT           NAME  ST  TIME_LEFT NODES CPUS       GRES MIN_MEM NODELIST (REASON)
       26636869    jarno def-jarno_cp        fakejob   R       1:35     1    1     (null)    250M cdr650 (None)
       26636871    jarno def-jarno_cp        fakejob   R       1:39     1    1     (null)    250M cdr652 (None)
       26636870    jarno def-jarno_cp        fakejob   R       1:42     1    1     (null)    250M cdr667 (None)
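
For comparison, here is a rough Python sketch of why the two cases differ. It is only a model, not Slurm's actual logic: the function name start_order is made up for illustration, it assumes that "singleton" makes a job wait for every earlier-submitted job of the same name and user (with job IDs increasing in submission order), and it treats both "after" and "singleton" simply as "must come later".

```python
def start_order(after, singleton_jobs=()):
    """Return a feasible start order, or None if the dependencies are circular.

    after:          dict mapping job ID -> set of job IDs it must come after
    singleton_jobs: job IDs flagged as singleton (same name and user)
    """
    deps = {job: set(pre) for job, pre in after.items()}

    # ASSUMPTION: a singleton job waits for all earlier-submitted singleton
    # jobs, and a lower job ID means an earlier submission.
    ordered = sorted(singleton_jobs)
    for i, job in enumerate(ordered):
        deps.setdefault(job, set()).update(ordered[:i])

    order = []
    remaining = dict(deps)
    while remaining:
        # A job is ready once none of its prerequisites are still pending.
        ready = sorted(j for j, pre in remaining.items()
                       if pre.isdisjoint(remaining))
        if not ready:
            return None  # every pending job waits on another: a cycle
        order.append(ready[0])
        del remaining[ready[0]]
    return order


# Case from this message: "after" only.
no_singleton = {26636869: set(), 26636870: {26636871}, 26636871: {26636869}}
print(start_order(no_singleton))

# Case from the original post: the same "after" chain plus singleton.
with_after = {25909273: set(), 25909274: {25909275}, 25909275: {25909273}}
print(start_order(with_after, singleton_jobs=[25909273, 25909274, 25909275]))
```

Under those assumptions the first call yields the order seen in the squeue output above, and the second finds no valid order because 25909274 and 25909275 end up waiting on each other. If singleton did not preserve submission order, the second case would reduce to the first.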

Thanks,
Jarno


Jarno van der Kolk, PhD Phys.
Analyste principal en informatique scientifique | Senior Scientific Computing Specialist
Solutions TI | IT Solutions
Université d’Ottawa | University of Ottawa


________________________________
From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Michael Di Domenico <mdidomenico4 at gmail.com>
Sent: August 28, 2019 10:26 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] Dependencies with singleton and after

Just curious: if you leave out the singleton, do you get the expected
behavior?

On Tue, Aug 27, 2019 at 9:42 AM Jarno van der Kolk <jvanderk at uottawa.ca> wrote:
>
> Hi all,
>
> I'm still puzzled by the expected behaviour of the following:
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909273
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909274
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909275
> $ scontrol update jobid=25909273 Dependency=singleton
> $ scontrol update jobid=25909274 Dependency=singleton,after:25909275
> $ scontrol update jobid=25909275 Dependency=singleton,after:25909273
> $ scontrol release 25909273 25909274 25909275
>
> I expected these to run in the order 25909273, 25909275, 25909274. However, it seems that singletons are executed in order of submission, which leads to a circular dependency: 25909274 depends on 25909275 due to "after", and 25909275 depends on 25909274 due to "singleton" plus order of submission.
>
> From the man page for sbatch, that wasn't really clear to me:
>              singleton
>                      This  job  can begin execution after any previously launched jobs sharing the same
>                      job name and user have terminated.
>
> I'm somewhat interested in creating a patch for this, but before I can look into this, I'll need to know what the expected behaviour is.
> If "launched" means submitted to the queue and preserving order, then I should focus on the circular dependency detection.
> If "launched" means entered the running state without preserving order, then I should focus on the dependency resolution.
>
> Any thoughts on this?
>
> Thanks,
> Jarno