[slurm-users] Creating groups of nodes with exclusive access to a resource within a partition.

Rich Cardwell richc at graphcore.ai
Tue Feb 1 10:42:22 UTC 2022


Hi,

I am wondering if this is possible with Slurm. I have an application where I want to create groups of nodes (group size would be between 1 and n servers) that have exclusive access to a shared resource, and then allow a configurable number of jobs to run on each group of nodes.

For example I could have:

partition: bulk, containing:

group1, max 4 jobs:
  - node1
  - node2
  - node3
  - node4


group2, max 2 jobs:
  - node5

group3, max 1 job:
  - node6
  - node7
  - node8
  - node9


Ideally the user could submit a job to a generic queue and I could set a configurable gres/license in the background for them, so that the job is placed in a free group, or pends if the exclusive resource it requires is not available.
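
To make the intended behaviour concrete, here is a minimal Python sketch of the placement I have in mind. It is only a model and not Slurm code: the NodeGroup class, the place_job function and the first-fit policy are assumptions made for illustration, and the groups mirror the example above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeGroup:
    """A set of nodes sharing one exclusive resource (hypothetical model)."""
    name: str
    nodes: List[str]
    max_jobs: int
    running: int = 0

    def has_free_slot(self) -> bool:
        return self.running < self.max_jobs


def place_job(groups: List[NodeGroup]) -> Optional[NodeGroup]:
    """Return the first group with a free job slot and count the job
    against it; return None when every group is full (the job pends)."""
    for group in groups:
        if group.has_free_slot():
            group.running += 1
            return group
    return None


GROUPS = [
    NodeGroup("group1", ["node1", "node2", "node3", "node4"], max_jobs=4),
    NodeGroup("group2", ["node5"], max_jobs=2),
    NodeGroup("group3", ["node6", "node7", "node8", "node9"], max_jobs=1),
]
```

With these limits, seven jobs can be placed across the three groups and an eighth would have to pend until a slot is released.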

I've taken a look at:

  1.  Using the job_submit lua plugin to look at the groups and, if a group has available resources, set a gres so that the job is placed correctly.
  2.  Licenses, but I can't see how to limit a license to a group of hosts without creating clusters. Can you limit licenses to specific nodes?
  3.  Running a script on the scheduler that builds the node configuration, updates the node gres and issues a 'scontrol reconfigure'.

Option 3 works, but isn't great.
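
For reference, option 3 is roughly along the lines of the Python sketch below. It is a dry run that only builds the commands and does not execute them. The gres name "shared_res", the per-node counts and the build_scontrol_commands helper are hypothetical, and it assumes the gres type is already declared in the Slurm configuration.

```python
import shlex
from typing import Dict, List


def build_scontrol_commands(node_gres: Dict[str, str]) -> List[List[str]]:
    """Build one 'scontrol update' command per node, followed by a
    final 'scontrol reconfigure'. Nothing is executed here."""
    commands = [
        ["scontrol", "update", f"NodeName={node}", f"Gres={gres}"]
        for node, gres in sorted(node_gres.items())
    ]
    commands.append(["scontrol", "reconfigure"])
    return commands


if __name__ == "__main__":
    # Hypothetical gres name and counts, for illustration only.
    desired = {"node5": "shared_res:2", "node6": "shared_res:1"}
    for cmd in build_scontrol_commands(desired):
        # To apply for real, each cmd could be passed to
        # subprocess.run(cmd, check=True) on the scheduler host.
        print(shlex.join(cmd))
```

Rebuilding the configuration this way on every change is the part that feels clumsy, which is why I would prefer to make the decision at submit time instead.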

So I would really like to be able to use a plugin to look at the current allocation and set a gres/license/partition for the user in the background. Is it possible for the job_submit lua plugin to access an external resource, or the license part of Slurm? If so, I could use that.

Or am I missing something, or doing something very wrong?

Thanks in advance for any assistance, it's much appreciated.


Rich Cardwell
Snr IT Engineer
richc at graphcore.ai
www.graphcore.ai
