[slurm-users] Having errors trying to run a packed jobs script

Benjamin Redling benjamin.rampe at uni-jena.de
Tue Nov 7 03:19:32 MST 2017

Hello Marius,

On 07.11.2017 at 10:12, Marius Cetateanu wrote:
> I have a very small cluster (if it could even be called a cluster) with
> only one node for the moment; the node is a dual Xeon with 14 cores/socket,
> hyper-threaded and 256GB of memory, running CentOS 7.3.

Bigger than a small cluster a decade ago... ;)
Nice workhorse I guess.

> The moment I schedule my script I can see that there are 50 instances of
> my process started and running, but just a bit afterwards I can see only
> 5 or so of them running - so I only get full load for the first 50
> instances and not afterwards.

"A bit afterwards" is too vague to conclude much from, other than that
sched_interval is probably still at its default of 60s:

What's the (average) runtime of the jobs?
If your jobs run for less than the default sched_interval, you might
want to *decrease* it.
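
For illustration, a minimal sketch of how that could look in slurm.conf.
The value 30 is only an assumed example, not a recommendation, and any
SchedulerParameters options you already have set would need to stay in
the same comma-separated list:

```conf
# slurm.conf (fragment)
# sched_interval is given in seconds; the default is 60.
# 30 is a hypothetical value chosen purely for illustration.
SchedulerParameters=sched_interval=30
```

After editing the file, "scontrol reconfigure" should make slurmctld
re-read it, and "scontrol show config | grep SchedulerParameters" shows
what is currently in effect. To get the runtimes, something like
"sacct -j <jobid> --format=JobID,Elapsed,State" should do, assuming job
accounting is enabled on your setup.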

