[slurm-users] Running pyMPI on several nodes

John Hearns hearnsj at googlemail.com
Fri Jul 12 06:46:15 UTC 2019


My apologies. You do say that the Python program simply prints the rank, so
it is a hello world program.
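For reference, a rank-printing test of that kind would look roughly like the
sketch below. This assumes mpi4py; the actual python-mpi.py is not shown in
this thread, so the file name and variable names are only placeholders. (A
sketch of a matching sbatch script follows the quoted message at the bottom.)

  # hello_mpi.py -- minimal rank-printing test (assumes mpi4py is installed)
  from mpi4py import MPI

  comm = MPI.COMM_WORLD            # communicator spanning all launched ranks
  rank = comm.Get_rank()           # this process's rank, 0 .. size-1
  size = comm.Get_size()           # total number of ranks srun started
  node = MPI.Get_processor_name()  # hostname, useful to check node spread

  print("Hello from rank %d of %d on %s" % (rank, size, node))

If each rank reports its hostname, you can see whether the job really spread
over both nodes.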

On Fri, 12 Jul 2019 at 07:45, John Hearns <hearnsj at googlemail.com> wrote:

> Please try something very simple such as a hello world program or
> srun -N2 -n8 hostname
>
> What error message do you get?
>
> On Fri, 12 Jul 2019 at 07:07, Pär Lundö <par.lundo at foi.se> wrote:
>
>>
>> Hi there Slurm-experts!
>> I am having trouble running a python-mpi program on more than one node.
>> The python-mpi program is very simple: it only lists the number of ranks
>> available in its environment. I have a munge daemon running prior to
>> starting the slurm service, and the program works when using a single
>> node (so I suppose munge is working).
>> In addition, I have tested running a simple sbatch script in which each
>> of the four available nodes prints its hostname and returns.
>> Since authentication with Slurm is handled via munge, do I need
>> passwordless SSH communication between the slurmctld host and the nodes?
>> (I found a guide, probably outdated, stating that passwordless SSH
>> communication is a necessity for Slurm,
>> HTTP://admin-magazine.com/HPC/Articles/Resource-Management-with-Slurm).
>>
>> I run the python-mpi program via an sbatch script that invokes an srun
>> command. Each node has 8 CPUs.
>> The srun command is:
>> "srun -N2 -n8 python3 python-mpi.py",
>> when tested on two nodes.
>> It works fine running on a single node (with "-N1" instead of "-N2"),
>> but it is aborted or stopped when running on two nodes.
>> Should I have "-n16" when running on two nodes? (In order to allocate
>> the full number of CPUs available on the two nodes.)
>> Slurm is configured and built with PMIx.
>> I am running Slurm 19.05 on Ubuntu 18.04 as the server, and the nodes
>> are running the same Slurm version on Ubuntu 18.10.
>>
>> Best regards,
>>
>> Palle
>>
>
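For what it is worth, a batch script along the lines described in the quoted
message would look roughly like this. It is only a sketch: the job name is a
placeholder, and --mpi=pmix matches a Slurm built with PMIx as you mention.

  #!/bin/bash
  #SBATCH --job-name=pympi-test      # placeholder name
  #SBATCH --nodes=2                  # two nodes
  #SBATCH --ntasks=16                # 8 CPUs per node x 2 nodes
  #SBATCH --ntasks-per-node=8

  # srun inherits the allocation, so it starts 16 ranks without needing -n16
  srun --mpi=pmix python3 python-mpi.py

With "srun -N2 -n8" you get 8 ranks spread over the two nodes; "-n16" (or
--ntasks=16 in the batch script as above) is what you want if the aim is one
rank per CPU on both nodes.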


More information about the slurm-users mailing list