[slurm-users] Multi Regional computing

Eunsong Goh provia at gmail.com
Fri Feb 3 00:15:02 UTC 2023


Hi,

I just finished building a cluster that consists of servers in multiple
cloud regions plus on-premises servers.

My Slurm cluster environment is as follows, and I want to run jobs across
a combination of worker nodes in different regions.

The Slurm master server was created in the GCP KR region.
Worker node #1 was created in the same region as the Slurm master server
and has 2 NVIDIA T4 GPUs.
Worker node #2 was created in a GCP US region and has 2 NVIDIA T4 GPUs.
Worker node #3 is one of the on-premises servers and has 8 NVIDIA T4
GPUs.

In this environment, can I run a single Slurm job on a combination of
server #1's 2 GPUs + server #2's 2 GPUs, or on server #1's 2 GPUs + the
on-premises server #3?

In the several tests I ran, combining GPUs from multiple regions failed.
Those jobs ran only on a worker node in a single region.
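
For reference, the kind of request described above could be sketched
roughly as follows. This is only an illustration, not the exact command
from my tests: the node names worker-kr and worker-us, the script name
job.sh, and the helper build_sbatch_command are hypothetical, and it
assumes both nodes sit in the same partition and each defines its GPUs
as a "gpu" GRES. The sbatch options used (--nodes, --nodelist, --gres,
--ntasks-per-node) are standard ones; --gres is counted per node.

```python
import shlex


def build_sbatch_command(nodes, gpus_per_node, script):
    """Build an sbatch argument list that pins a job to specific nodes.

    nodes         -- list of node names (hypothetical names in this sketch)
    gpus_per_node -- GPUs requested on EACH node (--gres is per node)
    script        -- path to the batch script (hypothetical)
    """
    if not nodes:
        raise ValueError("at least one node name is required")
    return [
        "sbatch",
        f"--nodes={len(nodes)}",             # one allocation spanning all nodes
        f"--nodelist={','.join(nodes)}",     # require exactly these nodes
        f"--gres=gpu:{gpus_per_node}",       # GPUs per node, not in total
        "--ntasks-per-node=1",
        script,
    ]


# Node #1 (KR region) + node #2 (US region), 2 GPUs on each.
cmd = build_sbatch_command(["worker-kr", "worker-us"], 2, "job.sh")
print(shlex.join(cmd))
```

The sketch only prints the command line instead of submitting it, so it
can be checked before anything is sent to the scheduler.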

Are there any mechanisms or rules that govern how multiple worker nodes
are combined for a job? And is there a priority rule for selecting among
multiple worker nodes?

Thanks