Perhaps you could be more explicit about the jobs being I/O bound and have them request an I/O GRES as well as CPU and memory resources. You could then set the amount of I/O resource per node (and maybe globally, possibly as separate iolocal and ioglobal resources). That way you could avoid I/O contention both locally and globally, instead of just shifting the problem around and hoping that spreading the load helps. Another option is to declare fewer CPUs per node (which has its own problems).

Of course, the difficulty of estimating I/O needs per job might make this whole idea unworkable. Mostly I wanted to point out that there are other ways of thinking about the problem, and that round-robin may just shift the problem around in an ugly way.
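
To make the I/O resource idea concrete, here is a toy model in Python. It is not Slurm and does not use any Slurm API; the Node class, the place function, and the numbers (8 CPUs and 4 "io" units per node) are all made up for illustration. It only shows how a per-node I/O budget caps the number of I/O-heavy tasks on a node even when CPUs are still free.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpus: int  # free CPU cores
    io: int    # free units of a hypothetical per-node "io" resource


def place(nodes, cpus_needed, io_needed):
    """Put a task on the first node with enough free CPUs and io units.

    Returns the node name, or None if the task would have to stay pending.
    """
    for node in nodes:
        if node.cpus >= cpus_needed and node.io >= io_needed:
            node.cpus -= cpus_needed
            node.io -= io_needed
            return node.name
    return None


# Two nodes with 8 cores each, but only 4 io units each (assumed values).
nodes = [Node("n1", cpus=8, io=4), Node("n2", cpus=8, io=4)]

# Ten single-core tasks, each claiming one io unit.
placements = [place(nodes, cpus_needed=1, io_needed=1) for _ in range(10)]
print(placements)
```

In this model each node takes four tasks and the last two stay pending, although half of the cores are idle. That is the trade-off: contention is bounded, but only if the per-task I/O estimate is roughly right.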

I would like an array job to load nodes with a round-robin allocation, instead of what seems to be the default method of loading the first node until it is full before moving on to the next node. Our cluster is used for bioinformatics, and jobs tend to be serial, high-throughput work with one or a few threads on a single node, rather than being distributed across nodes. The default of filling nodes sequentially doesn't work well for us because our jobs tend to be I/O bound.
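
To illustrate the difference between the two placement orders, here is a small Python sketch. It is a simplified model, not how Slurm actually selects nodes; the function names and the three 4-core nodes are assumptions made for the example.

```python
from collections import Counter
from itertools import cycle


def fill_first(node_cpus, n_tasks):
    """Fill each node completely before moving on to the next one."""
    placement = []
    for node, cpus in node_cpus.items():
        take = min(cpus, n_tasks - len(placement))
        placement.extend([node] * take)
    return placement


def round_robin(node_cpus, n_tasks):
    """Hand out one task per node in turn, skipping nodes that are full."""
    free = dict(node_cpus)
    placement = []
    for node in cycle(node_cpus):
        # Stop when every task is placed or no node has a free core left.
        if len(placement) == n_tasks or not any(free.values()):
            break
        if free[node] > 0:
            free[node] -= 1
            placement.append(node)
    return placement


# Three hypothetical nodes with 4 cores each, and a 6-task array job.
cluster = {"n1": 4, "n2": 4, "n3": 4}
print(Counter(fill_first(cluster, 6)))   # n1 gets 4 tasks, n2 gets 2, n3 none
print(Counter(round_robin(cluster, 6)))  # each node gets 2 tasks
```

With six single-core tasks, the fill-first order puts four I/O-bound tasks on one node, while the round-robin order puts at most two on any node.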

I've seen the thread starting at
https://groups.google.com/d/msg/slurm-users/uiKuFF8C-kU/mnJ1VcESBwAJ but I can't see the solution mentioned there (periodically setting node weights according to load) working for array jobs, because an array submits its tasks in clumps.

The LLN (least loaded nodes) strategy seems to be what I'm after but, as in the thread above, I can't get it to work. Has anyone managed to get this working?



