[slurm-users] Having errors trying to run a packed jobs script

Tue Nov 7 03:19:32 MST 2017

Hello Marius,

On 07.11.2017 at 10:12, Marius Cetateanu wrote:
> I have a very small cluster (if it even could be called a cluster) with only
> one node for the moment; the node is a dual Xeon with 14 cores/socket,
> hyper-threaded and 256GB of memory, running CentOS 7.3.

Bigger than a small cluster a decade ago... ;)
Nice workhorse I guess.

[...]
> The moment I schedule my script I can see that there are 50 instances of
> my process started and running but just a bit afterwards only 5 or so of
> them I can see running - so I only get full load for the first 50 instances
> and not afterwards.
"A bit afterwards" is too vague to reason about anything other than
sched_interval simply being at its default of 60s:
https://slurm.schedmd.com/sched_config.html

What's the (average) runtime of the jobs?
If your jobs are not running longer than the sched_interval default you
might want to *decrease* that.
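
For illustration, a minimal sketch of how that could look in slurm.conf.
The value 30 is only an assumed example, not a recommendation; pick
something in line with your actual job runtimes. If you already set other
SchedulerParameters options, add this one to the same comma-separated
line rather than creating a second line.

```
# slurm.conf (excerpt); the value 30 is an illustrative assumption
# sched_interval: seconds between main scheduling passes (default 60)
SchedulerParameters=sched_interval=30
```

Running "scontrol show config | grep SchedulerParameters" shows the value
currently in effect, and "scontrol reconfigure" should make slurmctld
re-read the file after the edit.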

Regards,
Benjamin
-- 
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
☎ +49 3641 9 44323