[slurm-users] "--batch" option of the sbatch command
Marcus Wagner
wagner at itc.rwth-aachen.de
Wed Oct 2 06:47:18 UTC 2019
Hi Loris,
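as far as I can tell from the man page, --batch controls the BatchHost (the
node where the batch script gets executed): it will be a node with one of
the features asked for by --batch.
Moreover, if the allocation does not contain a node with any of the features
requested via --batch, the job will start as usual on the first allocated node.
That seems to be what happened here: the one-node allocation was c001, which
does not have the broadwell feature, so the batch script ran there.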
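
To see which node actually became the BatchHost of a job, one can look at
the BatchHost field in the "scontrol show job" output. A minimal sketch,
assuming the job is still known to the controller; the helper name
batch_host is only for illustration and is not part of Slurm, and the job
ID 28 is the one from Tomo's squeue output:

```shell
#!/bin/sh
# Print the BatchHost value found in "scontrol show job" output on stdin.
# batch_host is a hypothetical helper name, not part of Slurm.
batch_host() {
    sed -n 's/.*BatchHost=\([^ ]*\).*/\1/p'
}

jobid=28   # job ID taken from the squeue output in this thread

# Only query Slurm if scontrol is installed on this machine.
if command -v scontrol >/dev/null 2>&1; then
    scontrol show job "$jobid" | batch_host
fi
```

Comparing that value with the NodeList field of the same output shows
whether the batch script ran on the first allocated node or on another one.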
@Tomo
So, if you want to use the broadwell node, just use
-C broadwell
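
Spelled out as a full submission, that would look roughly like the sketch
below. It reuses sleep_60.sh from Tomo's mail and drops the --batch option
entirely; the guard around sbatch is my own addition so that the snippet
only prints the command on a machine without Slurm.

```shell
#!/bin/sh
feature="broadwell"    # feature name as set with scontrol in this thread
script="sleep_60.sh"   # batch script from the original mail

# -C is the short form of --constraint.
cmd="sbatch -C $feature $script"

if command -v sbatch >/dev/null 2>&1; then
    # Intentionally unquoted so the string splits into command and arguments.
    $cmd
else
    echo "sbatch not found; would run: $cmd"
fi
```

With only c002 carrying the broadwell feature, the whole allocation (and
therefore the batch script) can only end up on c002.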
best
Marcus
On 10/2/19 8:23 AM, Loris Bennett wrote:
> Hi Tomo,
>
> "Uemoto, Tomoki" <fj2770fj at aa.jp.fujitsu.com> writes:
>
>> Hi all,
>> I'm working with Slurm 18.08.6 on RHEL 7.6
>> manager : 1node
>> computes: 2nodes (c001:haswell,c002:broadwell)
>>
>> I am checking the --batch option of the sbatch command.
>> The following Features were set for testing.
>>
>> # scontrol update nodename=c001 Features=haswell
>> # scontrol update nodename=c002 Features=broadwell
>>
>> And submitted a sleep job.
>>
>> $ cat sleep_60.sh
>> #!/bin/bash
>>
>> #SBATCH -J sleep_60 # Job name
>> #SBATCH -o job.%j.out # Name of stdout output file (%j expands to jobId)
>>
>> prun sleep 60
>> $
>>
>> $ sbatch --batch=broadwell --constraint="haswell|broadwell" sleep_60.sh
>> $ squeue -l
>> Wed Oct 2 13:50:40 2019
>>  JOBID PARTITION     NAME USER   STATE TIME  TIME_LIMI NODES NODELIST(REASON)
>>     28    normal sleep_60 test RUNNING 0:03 1-00:00:00     1 c001
>> $
>>
>> I expected the job to run on c002 (broadwell) because I specified "--batch=broadwell".
>> However, the job ran on c001 (haswell).
>> Why didn't it run on c002 (broadwell)?
>>
>> Regards,
>> Tomo
> I haven't come across this before, but the documentation seems to
> indicate that the option '--batch' applies only to the batch step. The
> actual job step, in your case the 'sleep', can then run on any node
> satisfying the condition given via the option '--constraint', so in your
> case either your Broadwell or your Haswell node.
>
> However, I would have thought that, all things being equal, the batch
> step and the job step would run on the same node. That said, I'm not sure
> how a constraint with 'or' is resolved if multiple solutions are
> available.
>
> What happens if you write the constraint as
>
> --constraint="broadwell|haswell"
>
> ?
>
> Cheers,
>
> Loris
>
--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
www.itc.rwth-aachen.de