[slurm-users] good practices

Huda, Zia Ul z.huda at fz-juelich.de
Mon Nov 25 09:37:42 UTC 2019


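I would recommend using the dependency option (-d, or --dependency) available in sbatch and srun.

You need -d afterok:<jobid>, which holds the new job in the queue until the job with that ID has completed successfully. Hopefully it works.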
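
A minimal sketch of how such a chain could be submitted, under some assumptions: job.sh is a hypothetical batch script standing in for your real one, submit_chain and the SBATCH variable are names made up for this example, and your sbatch supports --parsable (which prints only the job ID). All jobs are queued up front and the scheduler starts each one only after the previous one has finished with exit code zero.

```shell
# Sketch only: job.sh, submit_chain and SBATCH are illustrative names.
# SBATCH can be pointed at a stub for dry runs; it defaults to the real sbatch.
SBATCH=${SBATCH:-sbatch}

# submit_chain N SCRIPT: submit SCRIPT N times, each run depending on the previous one.
submit_chain() {
    local n=$1 script=$2 prev="" jobid i=1
    while [ "$i" -le "$n" ]; do
        if [ -z "$prev" ]; then
            # First job has no dependency.
            jobid=$($SBATCH --parsable --time=08:10:00 "$script") || return 1
        else
            # Later jobs wait until the previous job has ended successfully.
            jobid=$($SBATCH --parsable --time=08:10:00 \
                    --dependency=afterok:"$prev" "$script") || return 1
        fi
        jobid=${jobid%%;*}   # --parsable may print "jobid;cluster"
        echo "step $i submitted as job $jobid"
        prev=$jobid
        i=$((i + 1))
    done
}
```

Calling "submit_chain 20 job.sh" would then queue all 20 jobs in one go, with no polling or sleep needed on the login node.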


Zia Ul Huda
Forschungszentrum Jülich GmbH
Institute for Advanced Simulation (IAS)
Jülich Supercomputing Centre (JSC)
52425 Jülich, Germany

Phone: +49 2461 61 96905
E-mail: z.huda at fz-juelich.de

WWW: http://www.fz-juelich.de/ias/jsc/

JSC is the coordinator of the
John von Neumann Institute for Computing
and member of the
Gauss Centre for Supercomputing


On 25. Nov 2019, at 10:12, Nigella Sanders <nigella.sanders at gmail.com> wrote:

Hi all,

I guess this is a simple matter but I still find it confusing.

I have to run 20 jobs on our supercomputer.
Each job takes about 8 hours, and each one needs the previous one to have completed.
The queue time limit for jobs is 10 hours.

So my first approach was to launch them serially in a loop using srun:

for i in {1..20}; do
    srun --time 08:10:00 [options]
done

However, the SLURM literature keeps saying that 'srun' should only be used for short command-line tests, so some sysadmins would consider this bad practice (see this<https://stackoverflow.com/questions/43767866/slurm-srun-vs-sbatch-and-their-parameters>).

My second approach switched to sbatch:

for i in {1..20}; do
    sbatch --time 08:10:00 [options]
    [poll the queue until the job is done]
done

But since sbatch returns to the prompt immediately, I had to add code to check for job termination. Polling relies on the sleep command and is prone to race conditions, so sysadmins don't like it either.

I gather there is a --wait option in some recent versions of SLURM (see this<https://bugs.schedmd.com/show_bug.cgi?id=1685>), but it is not yet available on our system.

Is there any preferable/canonical/friendly way to do this?
Any thoughts would be much appreciated,


