[slurm-users] Granular or dynamic control of partitions?

Pacey, Mike m.pacey at lancaster.ac.uk
Fri Aug 4 14:40:34 UTC 2023


Hi folks,

We're currently moving our cluster from Grid Engine to SLURM, and I'm having trouble finding the best way to perform a specific bit of partition maintenance. I'm not sure if I'm simply missing something in the manual or if I need to be thinking in a more SLURM-centric way. My basic question: is it possible to 'disable' specific partition/node combinations rather than whole nodes or whole partitions? Here's an example of the sort of thing I'm looking to do:

I have a node 'node1' that is in two partitions, 'x' and 'y'. I'd like to remove partition 'y', but there are currently user jobs in that partition on that node. With Grid Engine, I could disable specific queue instances (i.e., I could just run "qmod -d y@node1" to disable queue/partition y on node1), wait for the jobs to complete, and then remove the partition. That would be the least disruptive option because:

  *   Queue/partition 'y' on other nodes would be unaffected
  *   User jobs for queue/partition 'x' would still be able to launch on node1 the whole time

I can't seem to find a functional equivalent of this in SLURM:

  *   I can set the whole node to Drain
  *   I can set the whole partition to Inactive
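
For concreteness, the two whole-object options above correspond roughly to the scontrol calls below. This is only a sketch: 'node1' and 'y' are the example names from this message, and run() with its DRY_RUN switch is a hypothetical helper added so the commands can be printed for inspection without touching a live cluster.

```shell
#!/bin/sh
# Sketch only: prints the scontrol commands unless DRY_RUN=0 is set.
# run() is a hypothetical helper, not part of Slurm.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Drain the whole node: running jobs finish, but no new jobs start on it,
# whether they come from partition x or partition y.
drain_node() {
    run scontrol update NodeName="$1" State=DRAIN Reason="$2"
}

# Mark the whole partition inactive: no new jobs are queued or started
# in that partition, on any of its nodes.
inactivate_partition() {
    run scontrol update PartitionName="$1" State=INACTIVE
}
```

Calling drain_node node1 "removing partition y" or inactivate_partition y prints the command that would be run. Neither call is scoped to the single combination of partition y and node1, which is the gap described here.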

Is there some way to 'disable' partition y just on node1?

Regards,
Mike