[slurm-users] Use a portion of resources already allocated for a script

Michael Lamparski diagonaldevice at gmail.com
Fri Sep 21 08:45:33 MDT 2018


> Today I discovered that sbatch can also create job steps (albeit
> awkwardly, via --jobid), and it can obviously run sbatch scripts... but as
> far as I can tell, it cannot run synchronously!

Rats.  I just discovered that sbatch has a "--wait" option, but it must
have been added after 2.2.4, because it isn't available on this machine.
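
For a machine without "--wait", one possible stopgap is to submit the job and poll squeue until it leaves the active states. The following is a minimal sketch, not a tested recipe: the function names are made up for illustration, and it assumes that sbatch prints "Submitted batch job N" and that "squeue -h -j N -o %T" prints the job's state name (or fails once the job record has been purged). It only waits; it does not recover the job's exit status.

```python
import re
import subprocess
import time

# Assumption: these are the squeue state names that mean "not finished yet".
ACTIVE_STATES = {"PENDING", "CONFIGURING", "RUNNING", "SUSPENDED", "COMPLETING"}


def parse_job_id(sbatch_output):
    """Extract the job id from sbatch's 'Submitted batch job N' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError("unrecognized sbatch output: %r" % sbatch_output)
    return match.group(1)


def job_is_active(job_id, run=subprocess.run):
    """Return True while squeue reports the job in an active state."""
    result = run(
        ["squeue", "-h", "-j", job_id, "-o", "%T"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    # Assumption: a non-zero exit means the job is no longer known to squeue.
    if result.returncode != 0:
        return False
    states = set(result.stdout.split())
    return bool(states & ACTIVE_STATES)


def sbatch_and_wait(script, poll_seconds=10.0, run=subprocess.run, sleep=time.sleep):
    """Submit `script` with sbatch, block until it finishes, return the job id."""
    submitted = run(
        ["sbatch", script],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    job_id = parse_job_id(submitted.stdout)
    while job_is_active(job_id, run=run):
        sleep(poll_seconds)
    return job_id
```

The `run` and `sleep` parameters exist only so the loop can be exercised without a cluster; on a real system the defaults would be used, e.g. `sbatch_and_wait("job.sh")`.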

Michael

On Thu, Sep 20, 2018 at 7:01 PM Michael Lamparski <diagonaldevice at gmail.com>
wrote:

> Hello all,
>
> For years I've been looking for what I might consider to be the holy grail
> of composable resource allocation in slurm jobs:
>
> * A command that can be run inside of an sbatch script...
> * ...which immediately and synchronously invokes another sbatch script
> (which may or may not invoke mpirun in turn)...
> * ...using a subset of the currently allocated resources.
>
> This is the smallest unit of functionality that would compose well with
> existing tools in UNIX for orchestration.  For instance, I could use xargs
> as a semaphore to let each node work on one input at a time, and for a
> given input I could have an arbitrarily complex python script decide
> dynamically what computations to run.
>
> Years of Google and manpage searches have continually failed me.
>
> * salloc can synchronously run an sbatch script, but as far as I can tell,
> it cannot make job steps, only jobs.
> * srun can run synchronously and make job steps, but as far as I can tell,
> it cannot call a script which calls mpirun (it insists on *replacing*
> mpirun).
> * Today I discovered that sbatch can also create job steps (albeit
> awkwardly, via --jobid), and it can obviously run sbatch scripts... but as
> far as I can tell, it cannot run synchronously!
>
> One can't help but wonder whether this is a deliberate omission or just a
> criminal oversight!
>
> Today I snapped and started working on a synchronous wrapper around
> sbatch; the plan is to use --jobid=$SLURM_JOB_ID, find out the job step id
> (somehow...), and then sattach to it.  I say this knowing it'll probably
> make your skin crawl.  And I ask: What do you think I ought to do instead?
>
> Michael
>
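
The "xargs as a semaphore" idea in the quoted message (at most one input in flight per node) can also be expressed from inside a Python driver. The sketch below is only an illustration under stated assumptions: `run_bounded` and `make_command` are hypothetical names, and the srun invocation in the comment is an example of what such a command might look like, not something confirmed to work on the cluster discussed here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_bounded(inputs, make_command, max_parallel):
    """Run one command per input, with at most `max_parallel` running at once.

    `make_command` maps an input to an argv list. Returns a list of
    (input, returncode) pairs in the same order as `inputs`.
    """
    def run_one(item):
        completed = subprocess.run(make_command(item))
        return item, completed.returncode

    # The pool size plays the role of the semaphore: a worker picks up the
    # next input only after its previous command has exited.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, inputs))


# Hypothetical usage inside an allocation of 4 nodes, one job step per input
# (process_input.py is a placeholder for the per-input script):
#
#   results = run_bounded(
#       input_files,
#       lambda name: ["srun", "-N1", "-n1", "--exclusive",
#                     "python", "process_input.py", name],
#       max_parallel=4,
#   )
```

Threads are sufficient here because each worker spends its time blocked on a child process rather than computing in Python.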