[slurm-users] slurm SBATCH - Multiple Nodes, Same SLURMD_NODENAME
Marcus Wagner
wagner at itc.rwth-aachen.de
Mon Jul 16 01:12:29 MDT 2018
Hi Sam,
this is expected; it is simply how bash works.
Regarding the #SBATCH --output problem: this seems to be an error,
because only one output file is created (I just tested it myself).
Regarding variable substitution:
srun echo SLURMD_NODENAME:$SLURMD_NODENAME
SLURM_ARRAY_TASK_ID:$SLURM_ARRAY_TASK_ID
SLURM_ARRAY_JOB_ID:$SLURM_ARRAY_JOB_ID SLURM_JOB_ID:$SLURM_JOB_ID
SLURM_TASK_PID:$SLURM_TASK_PID
bash evaluates the variables before the actual program is started.
Otherwise, e.g., "cd $HOME" would not work: on most Unix-like systems
no directory literally named "$HOME" exists, whereas the variable HOME
holds the path to the user's home directory.
So, in fact, here is what you are actually running:
srun echo SLURMD_NODENAME:node3 SLURM_ARRAY_TASK_ID: SLURM_ARRAY_JOB_ID:
SLURM_JOB_ID:2056 SLURM_TASK_PID:6441
This is exactly the output you received.
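
A minimal sketch of that substitution step, runnable without Slurm. The helper show_args is hypothetical and only prints the arguments a program receives; the value node3 is assumed from the output quoted in this thread.

```shell
#!/bin/bash
# Hypothetical helper: print each argument exactly as the called program sees it.
show_args() {
    printf '<%s>\n' "$@"
}

# Assumed value, taken from the output in this thread.
export SLURMD_NODENAME=node3

# Unquoted: bash substitutes the variable first, so the program only sees "node3".
expanded=$(show_args SLURMD_NODENAME:$SLURMD_NODENAME)

# Single-quoted: bash leaves the text untouched, so the program sees the literal string.
literal=$(show_args 'SLURMD_NODENAME:$SLURMD_NODENAME')

echo "$expanded"
echo "$literal"
```

In the unquoted case the called program never sees the variable name at all, which is why every srun task printed the value from the node where the batch script itself ran.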
Here's what you could try:
srun echo 'SLURMD_NODENAME:$SLURMD_NODENAME
SLURM_ARRAY_TASK_ID:$SLURM_ARRAY_TASK_ID
SLURM_ARRAY_JOB_ID:$SLURM_ARRAY_JOB_ID
SLURM_JOB_ID:$SLURM_JOB_ID SLURM_TASK_PID:$SLURM_TASK_PID'
The single quotes (not backticks!) should prevent bash from replacing the
variables.
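
One caveat worth checking: echo does not expand variables itself, so a single-quoted string passed straight to echo will most likely be printed literally. Something on the receiving side has to evaluate it, which is what the bash -c variant in the original post does. The sketch below simulates this without Slurm; fake_srun is a hypothetical stand-in for srun that merely runs the command with a different SLURMD_NODENAME, and node5 is an assumed value.

```shell
#!/bin/bash
# Hypothetical stand-in for srun: run the command with the environment
# a task on another node would have (node5 is an assumed value).
fake_srun() {
    env SLURMD_NODENAME=node5 "$@"
}

# Value seen by the batch script itself (first node of the allocation).
export SLURMD_NODENAME=node3

# Single quotes only: nothing expands the variable, echo prints it as text.
literal=$(fake_srun echo 'SLURMD_NODENAME:$SLURMD_NODENAME')

# Single quotes plus bash -c: the shell started for the task expands the
# variable, using the task's own environment.
late=$(fake_srun bash -c 'echo "SLURMD_NODENAME:$SLURMD_NODENAME"')

echo "$literal"
echo "$late"
```

With a real srun the same pattern would be srun bash -c 'echo ...', so that the expansion happens in a shell running on each node rather than in the batch script.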
Best
Marcus
On 07/13/2018 06:54 PM, Sam wrote:
>
> StackOverflow Thread:
> https://stackoverflow.com/questions/51328917/slurm-sbatch-multiple-nodes-same-slurmd-nodename
>
>
> possibly related to:
> https://groups.google.com/forum/#!topic/slurm-users/suclnO2V0aA
>
> - slurm-wlm 17.11.2
> - Installed from Ubuntu Apt repos, Ubuntu:18.04
>
> We have a cluster of 20 identical nodes.
> Running the simple script below gives me a confusing problem.
> All the jobs think they are running on node3, while running the
> hostname command gives the accurate answer. This is also a problem for
> the output filename. I expected to have many different outputs, but I
> get just one, with 'node3' in the filename. This seems to be a Bash
> Eval() / Variable substitution error.
> Wrapping
>
> $SLURMD_NODENAME
>
> in a
>
> bash -c "echo Bash3: \$SLURMD_NODENAME"
>
> works. But why did I have to do this? This workaround won't work for
> the #SBATCH --output
>
> cn.job:
>
> #!/bin/bash
> #SBATCH --output=/share/output.txt.%j.%J.%a.%A.%n.%N.%s.%t.%x
> #SBATCH --time=00:00:30
> #SBATCH --tasks-per-node=2
> #SBATCH --nodes=4
> srun hostname
> srun bash -c "echo Bash2: \$(hostname)"
> srun echo SLURMD_NODENAME:$SLURMD_NODENAME
> SLURM_ARRAY_TASK_ID:$SLURM_ARRAY_TASK_ID
> SLURM_ARRAY_JOB_ID:$SLURM_ARRAY_JOB_ID SLURM_JOB_ID:$SLURM_JOB_ID
> SLURM_TASK_PID:$SLURM_TASK_PID
> srun bash -c "echo Bash3: \$SLURMD_NODENAME"
> srun sleep 20
>
> Ran like:
>
> sbatch cn.job
>
> produces this output:
>
> **/share/output.txt.2056.2056.4294967294.2056.0.node3.4294967294.0.cn.job**
>
> node3
> node3
> node6
> node4
> node5
> node6
> node4
> node5
> Bash2: node3
> Bash2: node6
> Bash2: node4
> Bash2: node5
> Bash2: node3
> Bash2: node4
> Bash2: node6
> Bash2: node5
> SLURMD_NODENAME:node3 SLURM_ARRAY_TASK_ID: SLURM_ARRAY_JOB_ID:
> SLURM_JOB_ID:2056 SLURM_TASK_PID:6441
> SLURMD_NODENAME:node3 SLURM_ARRAY_TASK_ID: SLURM_ARRAY_JOB_ID:
> SLURM_JOB_ID:2056 SLURM_TASK_PID:6441
> SLURMD_NODENAME:node3 SLURM_ARRAY_TASK_ID: SLURM_ARRAY_JOB_ID:
> SLURM_JOB_ID:2056 SLURM_TASK_PID:6441
> SLURMD_NODENAME:node3 SLURM_ARRAY_TASK_ID: SLURM_ARRAY_JOB_ID:
> SLURM_JOB_ID:2056 SLURM_TASK_PID:6441
> SLURMD_NODENAME:node3 SLURM_ARRAY_TASK_ID: SLURM_ARRAY_JOB_ID:
> SLURM_JOB_ID:2056 SLURM_TASK_PID:6441
> SLURMD_NODENAME:node3 SLURM_ARRAY_TASK_ID: SLURM_ARRAY_JOB_ID:
> SLURM_JOB_ID:2056 SLURM_TASK_PID:6441
> SLURMD_NODENAME:node3 SLURM_ARRAY_TASK_ID: SLURM_ARRAY_JOB_ID:
> SLURM_JOB_ID:2056 SLURM_TASK_PID:6441
> SLURMD_NODENAME:node3 SLURM_ARRAY_TASK_ID: SLURM_ARRAY_JOB_ID:
> SLURM_JOB_ID:2056 SLURM_TASK_PID:6441
> Bash3: node3
> Bash3: node5
> Bash3: node3
> Bash3: node4
> Bash3: node6
> Bash3: node4
> Bash3: node6
> Bash3: node5
>
--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
www.itc.rwth-aachen.de