[slurm-users] "--batch" option of the sbatch command
fj2770fj at aa.jp.fujitsu.com
Wed Oct 2 09:01:25 UTC 2019
Thanks for your comment.
I understood as follows.
In the case of a typical Linux cluster, 'BatchHost' would be the compute node zero of the allocation.
* From BatchHost at https://slurm.schedmd.com/squeue.html
Therefore, it is not necessary to specify the '--batch' option.
From: slurm-users [mailto:slurm-users-bounces at lists.schedmd.com] On Behalf Of Marcus Wagner
Sent: Wednesday, October 02, 2019 3:47 PM
To: slurm-users at lists.schedmd.com
Subject: Re: [slurm-users] "--batch" option of the sbatch command
As far as I can read the man page, it indicates that the BatchHost (the node where the batch script gets executed) will be one of the types asked for by --batch.
Moreover, if the allocation does not include any of the features requested via --batch, the batch script will start as usual on the first allocated node.
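The behaviour described above can be sketched as a toy shell model. This is not Slurm's code: the function name pick_batch_host and the NODE:FEATURES argument format are made up for illustration, and only a single feature name is handled, not the & and | operators that --batch accepts. It only shows why a one-node allocation on c001 keeps the batch script on c001 even with --batch=broadwell.

```shell
#!/usr/bin/env bash
# Toy model of batch-host selection; NOT Slurm's actual implementation.
# Usage (hypothetical): pick_batch_host FEATURE NODE:FEAT1,FEAT2 [NODE:FEAT ...]
# Nodes are listed in allocation order. Prints the first node that carries
# FEATURE; if no allocated node has it, prints the first allocated node.
pick_batch_host() {
    local wanted=$1; shift
    local first=${1%%:*}          # fallback: node zero of the allocation
    local entry node feats
    for entry in "$@"; do
        node=${entry%%:*}         # text before the first ':'
        feats=${entry#*:}         # text after the first ':'
        # Wrap in commas so "haswell" does not match e.g. "haswell2".
        if [[ ",$feats," == *",$wanted,"* ]]; then
            echo "$node"
            return 0
        fi
    done
    echo "$first"
}

# One-node allocation on c001, as in the squeue output: falls back to c001.
pick_batch_host broadwell c001:haswell
# Hypothetical two-node allocation: the broadwell node is picked.
pick_batch_host broadwell c001:haswell c002:broadwell
```

In other words, --batch only chooses among the nodes already allocated to the job; which nodes get allocated is decided by --constraint.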
So, if you want to use the broadwell node, just use
On 10/2/19 8:23 AM, Loris Bennett wrote:
> Hi Tomo,
> "Uemoto, Tomoki" <fj2770fj at aa.jp.fujitsu.com> writes:
>> I'm working with slurm 18.08.6 on RHEL7.6
>> manager : 1node
>> computes: 2nodes (c001:haswell,c002:broadwell)
>> I am checking the --batch option of the sbatch command.
>> The following Features were set for testing.
>> # scontrol update nodename=c001 Features=haswell
>> # scontrol update nodename=c002 Features=broadwell
>> And submitted a sleep job.
>> $ cat sleep_60.sh
>> #SBATCH -J sleep_60 # Job name
>> #SBATCH -o job.%j.out # Name of stdout output file (%j expands to jobId)
>> prun sleep 60
>> $ sbatch --batch=broadwell --constraint="haswell|broadwell" sleep_60.sh
>> $ squeue -l
>> Wed Oct 2 13:50:40 2019
>> JOBID PARTITION NAME USER STATE TIME TIME_LIMI NODES NODELIST(REASON)
>> 28 normal sleep_60 test RUNNING 0:03 1-00:00:00 1 c001
>> I thought the job would be executed in "c002(broadwell)" by the designation of "--batch=broadwell".
>> However the job was executed in "c001(haswell)"
>> Why isn't it running on "c002(broadwell)" ?
> I haven't come across this before, but the documentation seems to
> indicate that the option '--batch' applies only to the batch step.
> The actual job step, in your case the 'sleep' can then run on any node
> satisfying the condition given via the option '--constraint', so in
> your case either your Broadwell or your Haswell node.
> However, I would have thought that, all things being equal, the batch
> step and the job step would run on the same node. However, I'm not
> sure how a constraint with 'or' is resolved if multiple solutions are
> What happens if you write the constraint as
Marcus Wagner, Dipl.-Inf.
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de