[slurm-users] "--batch" option of the sbatch command

Marcus Wagner wagner at itc.rwth-aachen.de
Wed Oct 2 06:47:18 UTC 2019


Hi Loris,

As far as I can tell from the man page, the BatchHost (the node where the
batch script is executed) will be a node with one of the features
requested via --batch.
Moreover, if the allocation does not contain a node with any of the
features requested via --batch, the job will start as usual on the first
allocated node.
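
To make the fallback concrete, here is a minimal sketch of that rule in
shell. It is a toy model, not Slurm code: the function name pick_batch_host
and the node:feature notation are invented for illustration, only single
feature names are handled (no & or | operators), and taking the first
matching node is an assumption, since the man page does not say which
matching node is chosen.

```shell
#!/usr/bin/env bash
# Toy model of the BatchHost rule described in the sbatch man page.
# NOT Slurm code; pick_batch_host and the node:feature format are hypothetical.

# Usage: pick_batch_host FEATURE NODE:FEATURE [NODE:FEATURE...]
# Prints the first allocated node that carries FEATURE (assumption: first match).
# If no allocated node has FEATURE, prints the first node of the allocation.
pick_batch_host() {
    local wanted=$1
    shift
    local first=${1%%:*}   # fallback: first allocated node
    local entry
    for entry in "$@"; do
        if [ "${entry#*:}" = "$wanted" ]; then
            printf '%s\n' "${entry%%:*}"
            return 0
        fi
    done
    printf '%s\n' "$first"
}

# Allocation with both node types: the batch script lands on the broadwell node.
pick_batch_host broadwell c001:haswell c002:broadwell

# One-node allocation without the feature: falls back to the first node.
pick_batch_host broadwell c001:haswell
```

The second call mirrors the job in the original post: squeue shows a
one-node allocation (c001) that contains no broadwell node, so the batch
script falls back to c001. On a real system, the BatchHost field in the
output of "scontrol show job" should confirm which node was used.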

@Tomo

So if you want the job to run on the Broadwell node, just use

-C broadwell
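
As a sketch, the full command lines might look like the following. The
submit helper is a made-up dry-run wrapper that only prints each command,
so the snippet runs without a cluster. The second command is an assumption
about how --batch could be made to matter: it asks for two nodes, so that
on the two-node test cluster the allocation would have to include both
c001 and c002. I have not verified this on a real system.

```shell
#!/usr/bin/env bash
# Dry run: commands are printed, not submitted.
# submit() is a hypothetical helper; replace its body with "$@" to submit for real.
submit() {
    echo "+ $*"
}

script=sleep_60.sh   # job script from the original post

# Restrict the whole job to nodes with the broadwell feature.
submit sbatch -C broadwell "$script"

# Assumed variant: a two-node job on either node type, with the batch
# script preferred on the broadwell node.
submit sbatch -N 2 --constraint='haswell|broadwell' --batch=broadwell "$script"
```

Note that the printed lines lose the quoting around haswell|broadwell, so
when typing the second command by hand the quotes need to be kept.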


best
Marcus

On 10/2/19 8:23 AM, Loris Bennett wrote:
> Hi Tomo,
>
> "Uemoto, Tomoki" <fj2770fj at aa.jp.fujitsu.com> writes:
>
>> Hi all,
>> I'm working with Slurm 18.08.6 on RHEL 7.6:
>>    manager : 1 node
>>    computes: 2 nodes (c001: haswell, c002: broadwell)
>>
>> I am checking the --batch option of the sbatch command.
>> The following Features were set for testing.
>>
>>    # scontrol update nodename=c001 Features=haswell
>>    # scontrol update nodename=c002 Features=broadwell
>>
>> Then I submitted a sleep job.
>>
>> $ cat sleep_60.sh
>> #!/bin/bash
>>
>> #SBATCH -J sleep_60           # Job name
>> #SBATCH -o job.%j.out         # Name of stdout output file (%j expands to jobId)
>>
>> prun sleep 60
>> $
>>
>> $ sbatch --batch=broadwell --constraint="haswell|broadwell" sleep_60.sh
>> $ squeue -l
>> Wed Oct  2 13:50:40 2019
>>               JOBID PARTITION     NAME     USER    STATE       TIME TIME_LIMI  NODES NODELIST(REASON)
>>                  28    normal sleep_60     test  RUNNING       0:03 1-00:00:00      1 c001
>> $
>>
>> I expected the job to run on c002 (broadwell) because I specified "--batch=broadwell".
>> However, the job ran on c001 (haswell).
>> Why didn't it run on c002 (broadwell)?
>>
>> Regards,
>> Tomo
> I haven't come across this before, but the documentation seems to
> indicate that the option '--batch' applies only to the batch step.  The
> actual job step, in your case the 'sleep', can then run on any node
> satisfying the condition given via the option '--constraint', so in your
> case either your Broadwell or your Haswell node.
>
> That said, I would have thought that, all things being equal, the batch
> step and the job step would run on the same node.  I'm not sure, though,
> how a constraint with 'or' is resolved if multiple solutions are
> available.
>
> What happens if you write the constraint as
>
>    --constraint="broadwell|haswell"
>
> ?
>
> Cheers,
>
> Loris
>

-- 
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
www.itc.rwth-aachen.de



