[slurm-users] Slurm node weights
novosirj at rutgers.edu
Thu Jul 25 17:02:05 UTC 2019
My understanding is that the topology plug-in will overrule node weights, which may or may not be a problem depending on your environment. I had a ticket open with SchedMD about this, because it looked like our nodes were being allocated in the exact reverse order of their weights. I suspected this was because our higher-weight equipment was on a switch with fewer nodes, and the scheduler was trying to keep workloads contiguous (opting to preserve larger blocks where possible). SchedMD was not able to duplicate this with my configuration, however, so it remains only a suspicion of mine, though I’ve heard that there IS an interaction of some sort.
|| \\UTGERS, |---------------------------*O*---------------------------
||_// the State | Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
|| \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark
On Jul 25, 2019, at 06:51, David Baker <D.J.Baker at soton.ac.uk> wrote:
I'm experimenting with node weights and I'm very puzzled by what I see. From the documentation, I gathered that jobs will be allocated to the nodes with the lowest weight that satisfy their requirements. I have 3 nodes in a partition and I have defined the nodes like so:
NodeName=orange01 Procs=48 Sockets=8 CoresPerSocket=6 ThreadsPerCore=1 RealMemory=1018990 State=UNKNOWN Weight=50
NodeName=orange[02-03] Procs=48 Sockets=8 CoresPerSocket=6 ThreadsPerCore=1 RealMemory=1018990 State=UNKNOWN
So, given that the default weight is 1, I would expect jobs to be allocated to orange02 and orange03 first. I find, however, that my test job is always allocated to orange01, the node with the higher weight. Have I overlooked something? I would appreciate your advice, please.
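For reference, the expectation described above can be written down as a small model. This is only a sketch of the documented lowest-weight-first rule, not Slurm's actual selection code: the `Node` class, the `pick_node` function, and the tie-break on node name are hypothetical and exist purely for illustration. It also ignores anything else that might influence placement, such as a topology plug-in.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpus: int
    weight: int = 1  # default weight of 1, as stated in the message above


def pick_node(nodes, cpus_needed):
    """Return the eligible node with the lowest weight, or None.

    Assumption: ties between equal weights are broken by node name.
    This is an illustrative choice, not something Slurm guarantees.
    """
    eligible = [n for n in nodes if n.cpus >= cpus_needed]
    if not eligible:
        return None
    return min(eligible, key=lambda n: (n.weight, n.name))


# The three nodes from the configuration above.
nodes = [
    Node("orange01", 48, weight=50),
    Node("orange02", 48),
    Node("orange03", 48),
]

chosen = pick_node(nodes, cpus_needed=4)
print(chosen.name)  # prints "orange02" under this simplified model
```

Under this model a small test job lands on orange02, never on orange01, which is the behaviour I expected but do not observe.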