[slurm-users] federation vs multi-cluster

Brian Andrus toomuchit at gmail.com
Mon Jun 26 13:58:32 UTC 2023


Mohammed,

Generally, you can think of federation as a way to centrally track and 
manage multiple clusters; it is mostly a way to run a single 'sreport' 
or 'sacct' command across all of them. There are added abilities, such 
as being able to specify which cluster a job is sent to, but for all 
intents and purposes the clusters themselves are independent. A job 
runs entirely within one cluster and is never split across clusters, 
which is why only one of your clusters received the tasks.

It sounds like what you may want is multiple partitions in the same 
cluster. You can have two partitions that each consist of one set of 
nodes and a third that comprises all nodes. Alternatively, use a single 
partition containing all nodes and assign features that delineate what 
each node can do (a node-locked license, for example). Either way, you 
can then send a job to a specific subset of nodes.
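
As a rough sketch of the partition layout described above, the short 
Python script below prints the kind of slurm.conf lines that would 
define it. The node names (node[1-4]), partition names (setA, setB, 
all) and feature tags (licA, licB) are made up for illustration, and 
the 8 CPUs per node simply mirror the numbers in the original question; 
adjust all of them to your site.

```python
# Hypothetical node groups: the names, CPU counts and feature tags are
# illustrative assumptions, not taken from a real site.
NODE_GROUPS = {
    "setA": {"nodes": "node[1-2]", "cpus": 8, "feature": "licA"},
    "setB": {"nodes": "node[3-4]", "cpus": 8, "feature": "licB"},
}


def render_slurm_conf(groups, all_nodes="node[1-4]"):
    """Return slurm.conf lines: one partition per group plus one spanning all nodes."""
    lines = []
    # Node definitions, each group tagged with a feature that jobs can
    # request through a constraint.
    for group in groups.values():
        lines.append(
            f"NodeName={group['nodes']} CPUs={group['cpus']} "
            f"Features={group['feature']} State=UNKNOWN"
        )
    # One partition per node set.
    for name, group in groups.items():
        lines.append(f"PartitionName={name} Nodes={group['nodes']}")
    # A third partition that contains every node.
    lines.append(f"PartitionName=all Nodes={all_nodes} Default=YES")
    return lines


conf_lines = render_slurm_conf(NODE_GROUPS)
print("\n".join(conf_lines))
```

With a layout like this, something like 'sbatch --partition=setA job.sh' 
would keep a job on the first set of nodes, while 
'sbatch --partition=all --constraint=licB job.sh' would use the single 
large partition and pick nodes by feature (job.sh is a placeholder). A 
job submitted to the 'all' partition can also span nodes from both 
sets, which separate clusters cannot do.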

There are quite a few other ways to design the ability you describe, 
but separate clusters is not one of them.

Brian Andrus

On 6/26/2023 6:11 AM, mohammed shambakey wrote:
> Hi
>
> Just out of interest, I wonder what the exact difference between Slurm 
> multi-cluster and federation is (apart from unique job IDs and 
> federation limitations). Usually, I use the "-Mall" option with 
> multi-cluster. Initially, I thought federation would send tasks to 
> more than one cluster at once. For example, I had 2 clusters, each 
> with 2 nodes, and each node has 8 CPUs. I submitted more than 16 tasks 
> and thought the federation would submit 16 tasks to the first cluster 
> and the additional tasks to the second cluster. But it seems 
> federation acts similarly to multi-cluster, as only one cluster 
> receives the tasks.
>
> Regards
>
> -- 
> Mohammed