[slurm-users] Using "srun" on compute nodes -- Ray cluster

Kamil Wilczek kmwil at mimuw.edu.pl
Fri Jul 15 09:17:23 UTC 2022


Dear Slurm Users,

one of my cluster users would like to run a Ray cluster on Slurm.
I noticed that the batch script example requires running the "srun"
command on a compute node that is already allocated:
https://docs.ray.io/en/latest/cluster/examples/slurm-template.html#slurm-template
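
To make the pattern concrete, here is a minimal sketch of the kind of "srun" command lines such a batch script issues from inside its allocation: one for the Ray head node and one per worker. It only builds and prints the commands and does not call Slurm. The function name, the hostnames, the IP address and the port are hypothetical and are not copied from the linked template. The options used ("--nodes", "--ntasks", "-w" for srun; "--head", "--node-ip-address", "--port", "--address", "--block" for "ray start") are the ones I believe exist, but please check them against the installed versions.

```python
import shlex


def build_ray_srun_commands(nodes, head_ip, port=6379):
    """Build the srun command lines a batch script would run on its allocated nodes.

    nodes:   hostnames of the job's allocation (hypothetical values here; in a
             real job they could come from
             `scontrol show hostnames "$SLURM_JOB_NODELIST"`).
    head_ip: IP address of the first node (assumed to be known already).
    port:    port for the Ray head process (assumed default for this sketch).
    """
    if not nodes:
        raise ValueError("need at least one allocated node")

    head, workers = nodes[0], nodes[1:]
    address = f"{head_ip}:{port}"
    # Each command targets exactly one node with exactly one task.
    common = ["srun", "--nodes=1", "--ntasks=1"]

    commands = [
        common + ["-w", head, "ray", "start", "--head",
                  f"--node-ip-address={head_ip}", f"--port={port}", "--block"]
    ]
    for worker in workers:
        commands.append(
            common + ["-w", worker, "ray", "start",
                      f"--address={address}", "--block"]
        )
    return commands


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in build_ray_srun_commands(["node01", "node02"], "10.0.0.1"):
        print(shlex.join(cmd))
```

In the batch script itself, these commands would be run in the background from the first node of the job, which is the part I am asking about.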

This is the first time I have seen or heard of this type of usage,
and I am having trouble wrapping my head around it.
Is there anything wrong or unusual about it? I understand that
this would allocate some resources on other nodes. Would
Slurm still enforce limits properly ("qos" or "partition" limits)?

Kind Regards
-- 
Kamil Wilczek  [https://keys.openpgp.org/]
[D415917E84B8DA5A60E853B6E676ED061316B69B]