<div dir="ltr"><div dir="ltr"><div>Please try something very simple, such as a hello world program or:</div><div><span style="text-align:left;color:black;text-transform:none;text-indent:0px;letter-spacing:normal;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt;font-style:normal;font-variant:normal;font-weight:400;text-decoration:none;word-spacing:0px;display:inline;white-space:normal;float:none;background-color:rgb(255,255,255)">srun -N2 -n8 hostname</span></div><div><br></div><div>What error message do you get?</div></div></div><br><div class="gmail_quote"><div class="gmail_attr" dir="ltr">On Fri, 12 Jul 2019 at 07:07, Pär Lundö <<a href="mailto:par.lundo@foi.se">par.lundo@foi.se</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;padding-left:1ex;border-left-color:rgb(204,204,204);border-left-width:1px;border-left-style:solid">
<div style="margin-top:8px">
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
<br>
Hi there Slurm-experts! </div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
I am having trouble running a python-mpi program on more than one node. The python-mpi program is very simple: it only reports the number of ranks available in its environment. I have a munge daemon running before the Slurm service is started,
and the program works on a single node (so I suppose munge is working). </div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
In addition, I have run a simple sbatch script in which each of the four available nodes prints its hostname and returns.
</div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
Since Slurm authenticates via munge, do I need passwordless SSH between the controller (slurmctld) and the nodes? (I found a guide, probably outdated, stating that passwordless SSH is a necessity for Slurm: <a href="HTTP://admin-magazine.com/HPC/Articles/Resource-Management-with-Slurm" target="_blank">HTTP://admin-magazine.com/HPC/Articles/Resource-Management-with-Slurm</a>.)
</div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
<br>
</div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
I run the python-mpi program from an sbatch script that invokes an srun command. Each node has 8 CPUs.
</div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
The srun command is: </div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
"srun -N2 -n8 python3 python-mpi.py", </div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
when tested on two nodes. </div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
It works fine on a single node (with "-N1" instead of "-N2"), but it is aborted or stopped when run on two nodes.
</div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
Should I use "-n16" when running on two nodes, in order to allocate all of the CPUs available on the two nodes?
</div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
Slurm is configured and built with PMIx. </div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
The server runs Slurm 19.05 on Ubuntu 18.04, and the nodes run the same Slurm version on Ubuntu 18.10.
</div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
<br>
</div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
Best regards, </div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
<br>
</div>
<div style="color:black;font-family:Calibri,Tahoma,Arial,Helvetica,sans-serif;font-size:11pt" dir="ltr">
Palle </div>
</div>
</blockquote></div>
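<div dir="ltr">As a step between "srun hostname" and the full python-mpi program, a script that does not use MPI at all may help narrow things down. The following is only a minimal sketch: the function name describe_task and the file name used below are hypothetical, and it assumes srun exports SLURM_PROCID, SLURM_NTASKS and SLURM_NODEID to each task.</div>
<pre>
```python
import os
import socket


def describe_task(env=None):
    """Return one line describing this task, based on variables set by srun."""
    env = os.environ if env is None else env
    # SLURM_PROCID, SLURM_NTASKS and SLURM_NODEID are expected to be set by
    # srun for each task; "?" is shown when a variable is missing, for
    # example when the script is run outside a Slurm job step.
    rank = env.get("SLURM_PROCID", "?")
    ntasks = env.get("SLURM_NTASKS", "?")
    node_id = env.get("SLURM_NODEID", "?")
    host = socket.gethostname()
    return "task {} of {} on node {} ({})".format(rank, ntasks, node_id, host)


if __name__ == "__main__":
    print(describe_task())
```
</pre>
<div dir="ltr">Saved as, say, slurm-env.py, it could be launched the same way as the failing job: srun -N2 -n8 python3 slurm-env.py. If that prints one line per task with two different hostnames, the Slurm launch itself is probably fine and the problem is more likely in the MPI layer.</div>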