[slurm-users] ticking time bomb? launching too many jobs in parallel

Jarno van der Kolk jvanderk at uottawa.ca
Thu Aug 29 14:38:41 UTC 2019


On 8/29/19 10:15 AM, Goetz, Patrick G wrote:
> On 8/27/19 11:47 AM, Brian Andrus wrote:
> > 1) If you can, either use xargs or parallel to do the forking so you can
> > limit the number of simultaneous submissions
> >
> 
> Sorry if this is a naive question, but I'm not following how you would
> use parallel with Slurm (unless you're talking about using it on a
> single node).  Parallel is what my non-Slurm users use to
> parallelize/distribute jobs.

Here's an example of how to do so from the Compute Canada docs:
https://docs.computecanada.ca/wiki/GNU_Parallel#Running_on_Multiple_Nodes

It combines parallel's --sshlogin parameter with SLURM_JOB_NODELIST, so
parallel runs inside a single Slurm job and spreads the tasks over the
nodes allocated to that job.
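
For illustration, here is a rough sketch of what such a job script might
look like. It is not copied from the linked page. The helper name
hosts_to_sshlogin, the program ./my_program, the limit of 4 jobs per node
and the 30 input values are all made up for the example, and it assumes
the nodes in the allocation accept passwordless ssh from each other.

```shell
#!/bin/bash
# Hypothetical helper: join newline-separated hostnames (read from stdin)
# into the comma-separated list that parallel's --sshlogin option accepts.
hosts_to_sshlogin() {
    paste -sd, -
}

# Only do the real work inside a Slurm allocation, where
# SLURM_JOB_NODELIST is set by Slurm.
if [ -n "${SLURM_JOB_NODELIST:-}" ]; then
    # scontrol expands a compact list such as node[01-03]
    # into one hostname per line.
    sshlogins=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | hosts_to_sshlogin)

    # Run at most 4 tasks at a time on each node (assumed limit).
    # ./my_program is a placeholder for the real command.
    parallel --jobs 4 --sshlogin "$sshlogins" --workdir "$PWD" \
        ./my_program {} ::: $(seq 1 30)
fi
```

Submitted with sbatch and something like --nodes=3, this is one Slurm job
as far as the scheduler is concerned, and parallel keeps the number of
simultaneously running tasks bounded within it.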

Jarno van der Kolk, PhD Phys.
Analyste principal en informatique scientifique | Senior Scientific Computing Specialist
Solutions TI | IT Solutions
Université d’Ottawa | University of Ottawa


