[slurm-users] speed / efficiency of sacct vs. scontrol

David Laehnemann david.laehnemann at hhu.de
Mon Feb 27 13:38:16 UTC 2023


Dear Ward,

if used correctly (and that is a big caveat for any method of
interacting with a cluster system), snakemake will only submit as many
jobs as fit within the resources of the cluster at any one point in
time (or however many resources you tell snakemake it can use). So
unless there are thousands of cores available (or you "lie" to
snakemake, telling it that there are many more cores than actually
exist), it will only ever have hundreds of jobs submitted at once (or
far fewer, if the jobs each require multiple cores). Accordingly, any
status queries will also only cover the jobs that snakemake currently
has submitted. And snakemake will only submit new jobs once it
registers previously submitted jobs as finished.
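
As a rough sketch (exact flags depend on your snakemake version and
cluster setup, and the sbatch template here is just an illustration),
it is the --jobs flag that caps how many jobs snakemake keeps
submitted at any one time:

    # keep at most 100 jobs submitted/running at once, no matter how
    # many jobs the workflow's DAG contains
    snakemake --jobs 100 \
        --cluster "sbatch --cpus-per-task={threads}"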

So workflow managers can actually help reduce the strain on the
scheduler, by only ever submitting jobs within the general limits of
the system (as opposed to, for example, using some bash loop to just
submit all of your analysis steps or samples at once). In addition,
snakemake has a mechanism to batch a number of smaller jobs into
larger jobs for submission on the cluster, so this might be something
to suggest to those of your users who cause trouble while using
snakemake (especially the `--group-components` mechanism):
https://snakemake.readthedocs.io/en/latest/executing/grouping.html
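
Very roughly, and with made-up rule, file and group names, the idea is
to tag rules with a group in the Snakefile and then tell snakemake on
the command line how many connected components of that group to bundle
into a single cluster job:

    # Snakefile: both rules end up in the same group job per sample
    rule map_reads:
        group: "per_sample"
        input: "reads/{sample}.fq"
        output: "mapped/{sample}.bam"
        shell: "my_mapper {input} > {output}"  # my_mapper is a placeholder

    rule sort_bam:
        group: "per_sample"
        input: "mapped/{sample}.bam"
        output: "sorted/{sample}.bam"
        shell: "samtools sort -o {output} {input}"

    # command line: bundle 20 such per-sample components into one
    # submitted cluster job
    snakemake --jobs 50 --group-components per_sample=20 \
        --cluster "sbatch --cpus-per-task={threads}"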

The query mechanism for job status is a different story. I'm on this
mailing list specifically to get as much input as possible to improve
it -- and I welcome anybody who wants to chime in on my respective
work-in-progress pull request right here:
https://github.com/snakemake/snakemake/pull/2136
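
To give a concrete idea of the general direction (this is just a
sketch of the approach, not necessarily what the pull request ends up
implementing, and the job IDs are made up): instead of calling
scontrol show job or sacct once per job, all currently submitted job
IDs can be collected into a single sacct call per polling interval:

    # one batched query for many jobs, instead of one query per job;
    # -X: only main allocations (no job steps), -n: no header,
    # -P: parsable, pipe-separated output
    sacct -X -n -P -o JobID,State -j 123456,123457,123458

Each output line then looks like "123457|RUNNING", which is cheap to
parse on the workflow manager's side, and the scheduler gets hit once
per polling interval instead of once per job.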

And if you are seeing a workflow management system causing trouble on
your system, probably the most sustainable way of getting this
resolved is to file issues or pull requests with the respective
project, with suggestions like the ones you made. For snakemake, a
second good place to chime in right now is the issue discussing Slurm
job array support: https://github.com/snakemake/snakemake/issues/301

And for Nextflow, another commonly used workflow manager in my field
(bioinformatics), there's also an issue discussing Slurm job array
support:
https://github.com/nextflow-io/nextflow/issues/1477

cheers,
david


On Mon, 2023-02-27 at 13:24 +0100, Ward Poelmans wrote:
> On 24/02/2023 18:34, David Laehnemann wrote:
> > Those queries then should not have to happen too often, although do
> > you
> > have any indication of a range for when you say "you still wouldn't
> > want to query the status too frequently." Because I don't really,
> > and
> > would probably opt for some compromise of every 30 seconds or so.
> 
> I think this is exactly why HPC sysadmins are sometimes not very
> happy about these tools. You're talking about 10000s of jobs on the
> one hand, yet you want to fetch the status every 30 seconds? What is
> the point of that, other than overloading the scheduler?
> 
> We're telling our users not to query Slurm too often and usually
> give 5 minutes as a good interval. You have to let Slurm do its job.
> There is no point in querying in a loop every 30 seconds when we're
> talking about large numbers of jobs.
> 
> 
> Ward



