Hi everyone. For accounting reasons, I need to run a single job that spans two or more federated clusters and contains two or more srun steps.

I'm trying heterogeneous jobs (hetjobs), but the documentation (https://slurm.schedmd.com/heterogeneous_jobs.html) does not make clear to me whether this is possible or how to do it.

I'm using the script below, but the steps are executed only on the first cluster.

Can you tell me whether there is a mistake in the hetjob script, or whether this has to be done another way?

 

#!/bin/bash

#SBATCH hetjob
#SBATCH --clusters=cluster2
srun -v --het-group=0 hostname

#SBATCH hetjob
#SBATCH --clusters=cluster3
srun -v --het-group=1 hostname



