[slurm-users] introduce short delay starting multiple parallel jobs with srun

Yakupov, Renat /DZNE Renat.Yakupov at dzne.de
Fri Nov 10 07:18:42 MST 2017


Thank you for the suggestion, Bill. I will take a look at the Launcher.

Best,
Renat.

________________________________________
From: slurm-users [slurm-users-bounces at lists.schedmd.com] On Behalf Of Bill Barth [bbarth at tacc.utexas.edu]
Sent: Friday, November 10, 2017 2:05 PM
To: Slurm User Community List
Subject: Re: [slurm-users] introduce short delay starting multiple parallel jobs with srun

Renat,

Not to toot our own horn too much, but TACC develops a tool designed for launching lots of individual serial tasks on a parallel system and distributing them well inside a single scheduler job. You might check out https://github.com/TACC/launcher. There’s no support for the tool, but it’s all open source and written mostly in Bash, so you can analyze its behavior or modify it if you need to. Tasks can be scheduled dynamically, so that if some finish faster than others the load is distributed more efficiently.

We don’t enable job arrays on our systems either (though our Slurm install is much more recent and does support them) for internal reasons, so we provide the Launcher for folks who have needs similar to yours.

Best,
Bill.

--
Bill Barth, Ph.D., Director, HPC
bbarth at tacc.utexas.edu        |   Phone: (512) 232-7069
Office: ROC 1.435            |   Fax:   (512) 475-9445
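
The dynamic scheduling Bill describes can be illustrated with a generic Bash sketch. This is not the Launcher's code or interface; the function name run_tasks and the task-file layout (one shell command per line) are assumptions made for illustration. It relies on `wait -n`, which needs Bash 4.3 or newer.

```shell
#!/bin/bash
# Illustrative only: run each line of a task file as a shell command,
# keeping at most `workers` commands running at once. A new task starts
# as soon as any running one finishes, so fast tasks do not wait for slow ones.
run_tasks() {
    local task_file=$1 workers=${2:-4} line
    while IFS= read -r line; do
        # Block while all worker slots are busy.
        while [ "$(jobs -rp | wc -l)" -ge "$workers" ]; do
            wait -n    # returns when any one background task exits
        done
        # Detach stdin so a task cannot consume the rest of the task file.
        bash -c "$line" < /dev/null &
    done < "$task_file"
    wait    # wait for the remaining tasks
}
```

Inside a single scheduler job this might be called as `run_tasks tasks.txt 16`, where tasks.txt is a hypothetical file of commands and 16 is the number of cores on the node.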



On 11/10/17, 4:00 AM, "slurm-users on behalf of Yakupov, Renat /DZNE" <slurm-users-bounces at lists.schedmd.com on behalf of Renat.Yakupov at dzne.de> wrote:

    Hi Gennaro,

    They are all tasks within the same job... If I were starting multiple jobs, I could just use --begin.
    I think I found a solution using the shell PID $$, which is unique for each task.

    Best,
    Renat.
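
Renat does not say how the PID is turned into a delay. One possible reading, as a minimal sketch: each task runs a small wrapper that sleeps for its PID modulo some maximum before starting the real program. The names pid_delay and MAX_DELAY are assumptions. Note that PIDs are only unique per node, and the modulo can give two tasks the same delay; the SLURM_PROCID variable that srun sets for each task would be a collision-free alternative.

```shell
#!/bin/bash
# Hypothetical wrapper, started once per task, for example:
#   srun ./wrapper.sh ./my_program
# (wrapper.sh and my_program are illustrative names).

# Map a process ID onto a delay in the range 0..max-1 seconds.
pid_delay() {
    local pid=$1 max=${2:-10}
    echo $(( pid % max ))
}

# Delay and launch only when a command is given on the command line.
if [ "$#" -gt 0 ]; then
    sleep "$(pid_delay "$$" "${MAX_DELAY:-10}")"
    exec "$@"
fi
```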
    ________________________________________
    From: slurm-users [slurm-users-bounces at lists.schedmd.com] On Behalf Of Gennaro Oliva [oliva.g at na.icar.cnr.it]
    Sent: Friday, November 10, 2017 10:51 AM
    To: Slurm User Community List
    Subject: Re: [slurm-users] introduce short delay starting multiple parallel jobs with srun

    Hi Renat,

    On Fri, Nov 10, 2017 at 10:03:37AM +0100, Yakupov, Renat /DZNE wrote:
    > slurm 2.5.0! seeing today's announcement about a double digit version
    > release, that is... ancient!

    According to the NEWS file, native job array support was introduced in
    version 2.6.0pre1 ... bad luck.

    You can still save the first SLURM_JOB_ID somehow and make the
    following jobs compute their sleep time using:

    $((SLURM_JOB_ID - FIRST_SLURM_JOB_ID))

    Regards
    --
    Gennaro Oliva
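
Gennaro's suggestion can be sketched as follows. How the first job ID is saved is left open in the message; here it is assumed to sit in a file on a shared filesystem named by FIRST_ID_FILE, which is a hypothetical variable, as are the function name stagger_delay and the optional step argument.

```shell
#!/bin/bash
# Illustrative helper: seconds to wait = distance from the first job ID,
# optionally multiplied by a step (default 1 second per job).
stagger_delay() {
    local first_id=$1 this_id=$2 step=${3:-1}
    echo $(( (this_id - first_id) * step ))
}

# Inside a job script: read the saved first ID (assumed file) and sleep.
# Outside Slurm, or without the file, this part is skipped.
if [ -n "${SLURM_JOB_ID:-}" ] && [ -r "${FIRST_ID_FILE:-}" ]; then
    FIRST_SLURM_JOB_ID=$(cat "$FIRST_ID_FILE")
    sleep "$(stagger_delay "$FIRST_SLURM_JOB_ID" "$SLURM_JOB_ID")"
fi
```

Job IDs are only consecutive if nobody else submits in between, so on a busy cluster the computed delay may be longer than intended.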






