[slurm-users] Dependencies with singleton and after

Jarno van der Kolk jvanderk at uottawa.ca
Thu Aug 22 13:23:24 UTC 2019


Hi Brian,

Thanks for the suggestion. I tried it out, but I got the same result.

$ sbatch --hold --dependency=singleton ./fakejob.sh 
Submitted batch job 26122715
$ sbatch --hold --dependency=singleton ./fakejob.sh 
Submitted batch job 26122716
$ sbatch --hold --dependency=singleton ./fakejob.sh 
Submitted batch job 26122720
$ scontrol update jobid=26122716 Dependency=singleton,after:26122720
$ scontrol update jobid=26122720 Dependency=singleton,after:26122715
$ scontrol release 26122715 26122716 26122720

... waiting for job 26122715 to complete ...

$ squeue -u jarno
          JOBID     USER      ACCOUNT           NAME  ST  TIME_LEFT NODES CPUS       GRES MIN_MEM NODELIST (REASON) 
       26122716    jarno def-jarno_cp        fakejob  PD       2:00     1    1     (null)    250M  (Dependency) 
       26122720    jarno def-jarno_cp        fakejob  PD       2:00     1    1     (null)    250M  (Dependency) 
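
For anyone who wants to reproduce this, here is a minimal sketch for checking which dependencies Slurm still reports as outstanding for the two pending jobs. The helper name show_deps is hypothetical (it is not a Slurm command), and the sketch assumes squeue and scontrol are on the PATH. It only wraps standard options: the %E field of squeue --format (remaining dependencies) and the Dependency= field of scontrol show job.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): print the dependency state
# that Slurm reports for each job ID given on the command line.
show_deps() {
    local jobid tool
    # Assumption: the Slurm client tools are installed and on PATH.
    for tool in squeue scontrol; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "Slurm client tools (squeue, scontrol) not found in PATH" >&2
            return 1
        fi
    done
    for jobid in "$@"; do
        # %i = job ID, %j = job name, %T = state, %E = remaining dependencies
        squeue --noheader --jobs="$jobid" --format="%i %j %T %E"
        # scontrol prints the full job record; keep only the Dependency field
        scontrol show job "$jobid" | grep -o 'Dependency=[^ ]*'
    done
}

# Usage, with the job IDs from this thread:
#   show_deps 26122716 26122720
```

Running show_deps 26122716 26122720 once job 26122715 has finished should show, for each job, which of the singleton and after dependencies Slurm still lists.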


Jarno van der Kolk, PhD Phys.
Analyste principal en informatique scientifique | Senior Scientific Computing Specialist
Solutions TI | IT Solutions
Université d’Ottawa | University of Ottawa



From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Brian Andrus <toomuchit at gmail.com>
Sent: August 21, 2019 5:26 PM
To: slurm-users at lists.schedmd.com <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] Dependencies with singleton and after

Have you tried adding the dependency at submit time?

sbatch --dependency=singleton fakejob.sh

Brian Andrus


On 8/21/2019 1:51 PM, Jarno van der Kolk wrote:
> Hi,
>
> I am helping a researcher who encountered unexpected behaviour with dependencies. He uses both "singleton" and "after". The minimal working example is as follows:
>
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909273
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909274
> $ sbatch --hold fakejob.sh
> Submitted batch job 25909275
> $ scontrol update jobid=25909273 Dependency=singleton
> $ scontrol update jobid=25909274 Dependency=singleton,after:25909275
> $ scontrol update jobid=25909275 Dependency=singleton,after:25909273
> $ scontrol release 25909273 25909274 25909275
>
> When the jobs are released, the scheduler starts job 25909273, as expected. The other two jobs stay pending because of the singleton dependency and the shared job name, which is also expected.
>
> However, when job 25909273 finishes, we would have expected job 25909275 to start, since the singleton is now free and job 25909274 cannot start due to its "after:25909275" dependency. That is, the expected order would be 25909273, 25909275, 25909274, one at a time.
>
> Instead what happens is that job 25909273 starts and completes and then jobs 25909274 and 25909275 remain queued with unsatisfied dependencies.
>
> It is entirely possible that I am thinking about this the wrong way, of course, but I don't see how. Is this expected behaviour?
>
> The content of fakejob.sh is simply this by the way, nothing special:
> #!/bin/bash
> #SBATCH --account=def-jarno
> #SBATCH --time=0:1:30
> #SBATCH --mem=250M
> #SBATCH --ntasks=1
> #SBATCH --job-name=fakejob
>
> echo "Starting fake job"
> sleep 60
> echo "Finished fake job"
>
>
> By the way, I realize this could be done with "afterany" instead of "singleton,after", but since this is a minimal working example it leaves out a lot of details of course.
>
> Thanks,
> Jarno
>
> Jarno van der Kolk, PhD Phys.
> Analyste principal en informatique scientifique | Senior Scientific Computing Specialist
> Solutions TI | IT Solutions
> Université d’Ottawa | University of Ottawa
>

