[slurm-users] Distribute jobs in similar nodes in the same partition
Hadrian Djohari
hxd58 at case.edu
Fri May 11 04:13:16 MDT 2018
You can use node features to define the node types in slurm.conf.
Then, when submitting the job, use -C <feature name> to run only on nodes
of that type.
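
To make that concrete, here is a minimal sketch of what the slurm.conf side
could look like. The node names, feature names, and partition name are made
up for illustration, and the other node parameters (CPUs, memory, and so on)
are left out; adjust everything to your own hardware.

```
# slurm.conf fragment (hypothetical node, feature, and partition names)
NodeName=nodeA[01-10] Features=typeA
NodeName=nodeB[01-20] Features=typeB
NodeName=nodeC[01-15] Features=typeC

# All three node types stay in a single partition
PartitionName=main Nodes=nodeA[01-10],nodeB[01-20],nodeC[01-15] Default=YES State=UP
```

A job can then be pinned to one type with something like
sbatch -C typeA job.sh (job.sh being a placeholder script). The constraint
option also accepts a bracketed list, for example -C "[typeA|typeB|typeC]",
which the sbatch man page describes as requiring every allocated node to
share one of the listed features. As far as I can tell, neither form falls
back to mixing node types when no single type has enough free nodes.
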
On Fri, May 11, 2018, 5:38 AM Antonio Lara <antonio.lara at uam.es> wrote:
> Hello everyone,
>
> Hopefully someone can help me with this. I cannot find in the manual
> whether this is even possible:
>
> I'm a system administrator, and the following question is from the
> administrator point of view, not the user's point of view:
>
> I work with a cluster which has a partition containing many nodes. These
> nodes belong to "different categories". That is, we bought several
> machines of the same type at once, and we did this several
> times. So, for example, we have 10 machines of type A, 20 machines of
> type B and 15 machines of type C. Machines of type A are more powerful
> than machines of type B, which are more powerful than machines of type C.
>
> What I am trying to achieve is that Slurm "forces" parallelized jobs to
> be allocated in machines of the same type, if possible. That is, that
> there is some type of priority which tries to allocate only machines of
> type A, or only machines of type B, or only of type C, and only
> distribute jobs among machines of different types when there are not
> enough nodes of the same type available.
>
> Does anyone know if this is possible? The idea behind this is that
> slower machines should not delay the calculations on faster machines
> when a job is distributed among them, so that all machines work at more
> or less the same pace.
>
> I've been told that it is NOT an option to create different partitions,
> each containing only one type of machine.
>
> Please note that I'm not looking for a way to choose, as a user, which
> nodes to use for a job. What I need is for Slurm to do that and decide
> which nodes to use, using similar nodes if available.
>
> The closest that I could find in the manual was using consumable
> resources, but I think this is not what I need. There are several
> examples, but they don't seem to fit this case.
>
> Thank you for your help!
>
>
>