[slurm-users] Granular or dynamic control of partitions?
m.pacey at lancaster.ac.uk
Fri Aug 4 14:40:34 UTC 2023
We're currently moving our cluster from Grid Engine to SLURM, and I'm having trouble finding the best way to perform a specific bit of partition maintenance. I'm not sure if I'm simply missing something in the manual or if I need to be thinking in a more SLURM-centric way. My basic question: is it possible to 'disable' specific partition/node combinations rather than whole nodes or whole partitions? Here's an example of the sort of thing I'm looking to do:
I have node 'node1' with two partitions 'x' and 'y'. I'd like to remove partition 'y', but there are currently user jobs in that partition on that node. With Grid Engine, I could disable specific queue instances (i.e., I could just run "qmod -d y@node1" to disable queue/partition y on node1), wait for the jobs to complete, and then remove the partition. That would be the least disruptive option because:
* Queue/partition 'y' on other nodes would be unaffected
* User jobs for queue/partition 'x' would still be able to launch on node1 the whole time
I can't seem to find a functional equivalent of this in SLURM:
* I can set the whole node to Drain
* I can set the whole partition to Inactive
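For concreteness, this is roughly how I understand those two coarse-grained options on the command line. It is only a sketch: the `run` wrapper and the two function names are my own illustration (the wrapper prints each command unless DRY_RUN=0 is set), and the Reason text is made up.

```shell
# Hypothetical helper: print the command by default; set DRY_RUN=0 to execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Option 1: drain the whole node. Running jobs finish, but no new jobs start
# on the node in ANY partition, so partition x is blocked there as well.
drain_node() {
    run scontrol update NodeName="$1" State=DRAIN Reason="$2"
}

# Option 2: set the whole partition inactive. As I understand it, this stops
# new jobs in that partition on EVERY node, not just node1.
deactivate_partition() {
    run scontrol update PartitionName="$1" State=INACTIVE
}

drain_node node1 "retiring partition y"
deactivate_partition y
```

Neither of these matches the per-queue-instance behaviour of "qmod -d y@node1": the first is too broad across partitions, the second too broad across nodes.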
Is there some way to 'disable' partition y just on node1?