[slurm-users] Array Job Node Allocation
Gareth.Williams at csiro.au
Wed Mar 21 04:34:49 MDT 2018
Hi Emyr,
Perhaps you could be more explicit about the jobs being I/O bound and have them request an io gres (generic resource) as well as compute and memory resources. You could then set the amount of io resource available per node (and maybe globally, possibly as separate iolocal and ioglobal resources). That way you could avoid I/O contention locally and globally, instead of just shifting the problem about and hoping that spreading the load helps. Another option is to declare fewer CPUs per node than there really are (which has its own problems).
Of course, difficulty in estimating the I/O needs of each job might break this whole idea... Mostly I wanted to point out that there are other ways of thinking about the problem, and that round-robin may just shift the problem around in an ugly way.
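To make the io gres idea a bit more concrete, here is a rough Python toy model (not Slurm code, and not how Slurm's scheduler is implemented). It assumes each node advertises a hypothetical count of "io" units alongside its CPUs, and that each job asks for some of both. The node names, capacities and job sizes are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpus: int  # free CPU cores on the node
    io: int    # free units of the hypothetical "io" resource


def place(jobs, nodes):
    """First-fit placement: a job lands on the first node that has
    enough free CPUs *and* enough free io units for it."""
    placement = {}
    for job_id, (cpus, io) in jobs.items():
        for node in nodes:
            if node.cpus >= cpus and node.io >= io:
                node.cpus -= cpus
                node.io -= io
                placement[job_id] = node.name
                break
        else:
            placement[job_id] = None  # nothing fits, job stays pending
    return placement


# Two 16-core nodes that can each sustain only 4 io units (assumed numbers).
nodes = [Node("n1", cpus=16, io=4), Node("n2", cpus=16, io=4)]
# Ten single-core array tasks, each asking for 1 io unit.
jobs = {task: (1, 1) for task in range(10)}

result = place(jobs, nodes)
for task, node_name in result.items():
    print(task, node_name or "pending")
```

Even with fill-first placement, the io limit caps each node at four of these tasks, so the remaining tasks wait instead of piling onto a node that still has idle cores. The hard part, as noted, is picking sensible io numbers for the nodes and the jobs.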
best wishes,
Gareth
-----Original Message-----
From: slurm-users [mailto:slurm-users-bounces at lists.schedmd.com] On Behalf Of Emyr James
Sent: Wednesday, 21 March 2018 4:54 PM
To: slurm-users at schedmd.com
Subject: [slurm-users] Array Job Node Allocation
Dear all,
I would like an array job to load nodes with a round-robin allocation, instead of what seems to be the default method of loading the first node until it is full before moving on to the next node. Our cluster is used for bioinformatics, and jobs tend to be serial and high-throughput, with one or a few threads on a single node, as opposed to jobs being distributed across nodes. The default, whereby nodes are filled sequentially, doesn't work well for us given that jobs tend to be I/O bound.
I've seen the thread starting at
https://groups.google.com/d/msg/slurm-users/uiKuFF8C-kU/mnJ1VcESBwAJ but I can't see the solution mentioned there (periodically setting node weights according to load) working for array jobs as it submits jobs in clumps.
The LLN strategy seems to be what I'm after but, as in the thread above, I can't get it to work. Has anyone managed to get this working?
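To illustrate the difference I mean, here is a small Python toy model of the two placement policies (purely illustrative, not Slurm's actual implementation). It assumes single-core tasks and treats "least loaded" as "the node with the most idle CPUs"; the function names and node sizes are made up.

```python
def fill_first(num_tasks, cpus_per_node):
    """Pack each task onto the first node that still has a free CPU."""
    load = [0] * len(cpus_per_node)
    for _ in range(num_tasks):
        for i, capacity in enumerate(cpus_per_node):
            if load[i] < capacity:
                load[i] += 1
                break
    return load


def least_loaded(num_tasks, cpus_per_node):
    """Send each task to the node with the most idle CPUs
    (ties go to the lowest-numbered node)."""
    load = [0] * len(cpus_per_node)
    for _ in range(num_tasks):
        free = [i for i, cap in enumerate(cpus_per_node) if load[i] < cap]
        if not free:
            break  # cluster is full, remaining tasks would stay pending
        target = max(free, key=lambda i: cpus_per_node[i] - load[i])
        load[target] += 1
    return load


# Eight single-core array tasks on four 8-core nodes (assumed sizes).
packed = fill_first(8, [8, 8, 8, 8])
spread = least_loaded(8, [8, 8, 8, 8])
print("fill-first:  ", packed)
print("least-loaded:", spread)
```

With fill-first, all eight tasks share one node's disks and network link, whereas the least-loaded policy puts two tasks on each node, which is the behaviour I am hoping to get for array jobs.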
Regards,
Emyr
--
The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.