[slurm-users] federation vs multi-cluster
Brian Andrus
toomuchit at gmail.com
Mon Jun 26 13:58:32 UTC 2023
Mohammed,
Generally, you can think of federation as a way to centrally track and
manage your multiple clusters. It is more of a way to run single 'sreport'
and 'sacct' commands across all of them. There are added abilities, such as
being able to specify the cluster to send a job to, but for all intents and
purposes, the clusters themselves are independent, so a single job runs
entirely on one cluster and is never split across them.
It sounds like what you may want is multiple partitions in the same
cluster. You can have two that each consist of one set of nodes and a third
that consists of all nodes. Or you can have a single partition with all
nodes and use features to delineate what each node can do (a node-locked
license, for example). Then you can send a job to a specific subset of
nodes.
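As a rough sketch of that layout, assuming the four 8-CPU nodes from your
example are merged into one cluster, a slurm.conf fragment might look like
the following. The node names (node01 to node04), the partition names
(set_a, set_b, all) and the feature names (lic_a, lic_b) are made up for
illustration, and a real config needs the rest of the usual settings:

```
# Hypothetical node definitions: two sets of two nodes, 8 CPUs each.
# Feature tags mark what each set can do (e.g. a node-locked license).
NodeName=node[01-02] CPUs=8 Feature=lic_a State=UNKNOWN
NodeName=node[03-04] CPUs=8 Feature=lic_b State=UNKNOWN

# Two partitions covering the individual sets, and a third covering all nodes.
PartitionName=set_a Nodes=node[01-02] State=UP
PartitionName=set_b Nodes=node[03-04] State=UP
PartitionName=all   Nodes=node[01-04] Default=YES State=UP
```

With something like this, 'sbatch --partition=all --ntasks=20 job.sh' could
spread more than 16 tasks over all four nodes, while
'sbatch --partition=all --constraint=lic_a job.sh' would keep a job on the
nodes tagged with that feature.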
There are quite a few other ways to design the ability you describe, but
separate clusters are not one of them.
Brian Andrus
On 6/26/2023 6:11 AM, mohammed shambakey wrote:
> Hi
>
> Just out of interest, I wonder what the exact difference between slurm
> multi-cluster and federation (apart from unique job id, and federation
> limitations) is. Usually, I use the "-Mall" option with multi-cluster.
> Initially, I thought the federation would send tasks to more than one
> cluster at once (e.g., I had 2 clusters, each cluster has 2 nodes, and
> each node has 8 CPUs. I submitted more than 16 tasks, and thought the
> federation would submit 16 tasks to the first cluster, and the
> additional tasks to the second cluster. But, it seems the federation
> acts similarly to the multi-cluster as only one cluster receives the
> tasks).
>
> Regards
>
> --
> Mohammed