[slurm-users] Distribute jobs in similar nodes in the same partition

Antonio Lara antonio.lara at uam.es
Fri May 11 03:36:56 MDT 2018


Hello everyone,

Hopefully someone can help me with this. I cannot find in the manual 
whether it is even possible.

I'm a system administrator, and the following question is from the 
administrator's point of view, not the user's:

I work with a cluster that has a partition containing many nodes. These 
nodes belong to different "categories": on several occasions we bought a 
batch of identical machines. So, for example, we have 10 machines of 
type A, 20 machines of type B and 15 machines of type C. Machines of 
type A are more powerful than machines of type B, which are in turn more 
powerful than machines of type C.

What I am trying to achieve is that Slurm "forces" parallel jobs to be 
allocated on machines of the same type, if possible. That is, there 
should be some kind of priority that tries to allocate only machines of 
type A, or only machines of type B, or only machines of type C, and 
spreads a job across machines of different types only when there are 
not enough nodes of a single type available.
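
To make the behaviour I am after concrete, here is a rough sketch in 
Python. It is only an illustration of the policy, not Slurm code or 
configuration: the function name pick_nodes, the node names and the 
dictionary layout are all made up for the example, and I am assuming the 
types are listed from fastest to slowest.

```python
def pick_nodes(idle_by_type, needed):
    """Return `needed` node names, from a single type whenever possible.

    idle_by_type maps a (hypothetical) type label to its idle nodes,
    with the types listed from fastest to slowest.
    """
    # First pass: look for one type that can hold the whole job.
    for nodes in idle_by_type.values():
        if len(nodes) >= needed:
            return nodes[:needed]

    # Fallback: no single type is big enough, so mix types,
    # taking nodes from the faster types first.
    pool = [n for nodes in idle_by_type.values() for n in nodes]
    if len(pool) < needed:
        return None  # not enough idle nodes in total; the job must wait
    return pool[:needed]


# Made-up idle nodes, for illustration only.
idle = {
    "A": ["a01", "a02"],
    "B": ["b01", "b02", "b03", "b04"],
    "C": ["c01", "c02", "c03"],
}

print(pick_nodes(idle, 2))  # fits entirely in type A
print(pick_nodes(idle, 4))  # type A is too small, fits entirely in type B
print(pick_nodes(idle, 6))  # no single type fits, so types are mixed
```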

Does anyone know if this is possible? The idea is to prevent slower 
machines from delaying the calculations on faster machines when a job 
is distributed among them, so that all machines in a job work at more 
or less the same pace.

I've been told that it is NOT an option to create different partitions, 
each containing only one type of machine.

Please note that I'm not looking for a way for a user to choose which 
nodes to use for a job. What I need is for Slurm itself to decide which 
nodes to use, preferring similar nodes when they are available.

The closest thing I could find in the manual was consumable resources, 
but I don't think this is what I need. There are several examples, but 
none of them seems to fit this case.

Thank you for your help!



