[slurm-users] snakemake and slurm in general

David Laehnemann david.laehnemann at hhu.de
Thu Feb 23 16:07:55 UTC 2023


Hi Loris,

I gave this a new subject, as this has nothing to do with my original
question.

Maybe this is what you were looking for in the snakemake documentation:

https://snakemake.readthedocs.io/en/latest/executing/grouping.html#job-grouping

You can basically bundle groups of (snakemake) jobs together, and
snakemake will then submit each such group as a combined batch job to
Slurm instead of submitting every single job on its own.

However, from what I understand, this means that you have to tailor
your workflow's group setup to the respective Slurm cluster and the
nodes it has, to make sure that the grouped jobs actually fit onto
those nodes. This is usually not very portable. And submitting
individual jobs is much more flexible in using available resources,
because it does not block larger amounts of resources at once. So
there is always a tradeoff between different optimizations here.
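
For concreteness, the group assignment can even be done entirely on
the command line, without touching the Snakefile. A rough sketch (the
rule and group names are made up, and the Slurm submission flags,
which differ between snakemake versions, are left out):

  # hypothetical rule names; Slurm submission flags omitted
  snakemake \
      --groups map_reads=mapping sort_bam=mapping \
      --group-components mapping=20 \
      --jobs 50

Here --groups assigns the two rules to one named group, and
--group-components tells snakemake to bundle 20 connected subgraphs of
that group into a single submitted job.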

In addition, there are very clear limits to how many jobs Slurm can
handle in its queue; see for example this discussion:
https://bugs.schedmd.com/show_bug.cgi?id=2366

So, to me, it makes a lot of sense to use snakemake to, for example,
limit the total number of cores it can request to something matching
the number of cores actually available on the Slurm cluster. Otherwise
the queue is quickly overwhelmed, my own (and other people's) jobs
simply do not get scheduled, and running workflow instances abort.
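
That kind of throttling is just a matter of the right command-line
flags. The numbers below are made up and would have to match your
cluster's actual limits; again, the Slurm submission flags are left
out:

  # keep at most 500 jobs in the Slurm queue at once, and cap the summed
  # memory of all concurrently submitted jobs (this only works for
  # resources that the rules actually declare, mem_mb in this case)
  snakemake \
      --jobs 500 \
      --resources mem_mb=2000000 \
      --keep-going

--jobs limits how many jobs snakemake keeps queued or running at any
one time, and --resources puts a global cap on any resource that the
individual rules declare.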

In general, Slurm has these limitations; other cluster systems have
others. Workflow managers have to accommodate many different systems
and setups. At least for snakemake I know that, as a result, it
provides quite generalizable solutions for resource management. And I
actually think that this is because such systems originated in
research environments where HPC clusters are quite common, so they try
to interact with them as nicely as possible.

But both snakemake and nextflow are open-source projects, maintained
by a community, so they depend on people implementing things like
cluster system support without getting much credit for it (apart from
the fact that things then work for them, on their local system). And
they depend on people with deeper knowledge of such cluster systems
providing their help along the way, which is why I am on this list
now, asking for insights. So feel free to dig into the respective code
bases with a bit of that grumpy energy, making snakemake or nextflow a
bit better in how they deal with Slurm.

cheers,
david



On Thu, 2023-02-23 at 15:38 +0100, Loris Bennett wrote:
> Hi David,
> 
> David Laehnemann <david.laehnemann at hhu.de> writes:
> 
> [snip (16 lines)]
> 
> > P.S.: @Loris and @Noam: Exactly, snakemake is a software distinct from
> > slurm that you can use to orchestrate large analysis workflows---on
> > anything from a desktop or laptop computer to all kinds of cluster /
> > cloud systems. In the case of Slurm it will submit each analysis step
> > on a particular sample as a separate job, specifying the resources it
> > needs. The scheduler then handles it from there. But because you can
> > have (hundreds of) thousands of jobs, and with dependencies among them,
> > you can't just submit everything all at once, but have to keep track of
> > where you are at. And make sure you don't submit much more than the
> > system can handle at any time, so you don't overwhelm the Slurm queue.
> 
> [snip (86 lines)]
> 
> I know what Snakemake and other workflow managers, such as Nextflow,
> are for, but my maybe ill-informed impression is that, while something
> of this sort is obviously needed to manage complex dependencies, the
> current solutions, probably because they originated outside the HPC
> context, try to do too much.  You say Snakemake helps
> 
>   make sure you don't submit much more than the system can handle
> 
> but that in my view should not be necessary.  Slurm has configuration
> parameters which can be set to limit the number of jobs a user can
> submit and/or run.  And when it comes to submitting (hundreds of)
> thousands of jobs, Nextflow for example currently can't create job
> arrays, and so generates large numbers of jobs with identical resource
> requirements, which can prevent backfill from working properly.
> Skimming the documentation for Snakemake, I also could not find any
> reference to Slurm job arrays, so this could also be an issue.
> 
> Just my slightly grumpy 2¢.
> 
> Cheers,
> 
> Loris
> 



