[slurm-users] getting closer
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Fri Jun 28 07:39:12 UTC 2019
On 6/28/19 9:18 AM, Valerio Bellizzomi wrote:
> On Fri, 2019-06-28 at 08:51 +0200, Valerio Bellizzomi wrote:
>> On Thu, 2019-06-27 at 18:35 +0200, Valerio Bellizzomi wrote:
>>> The nodes are now communicating; however, when I run the command
>>>
>>> srun -w compute02 /bin/ls
>>>
>>> it hangs and there is no output on the submit machine.
>>>
>>> On compute02 there is a communication error and a timeout.
>>>
>>> Network ports 6817 and 6818 are open.
>>
>>
>> Looking at the firewall logs, slurmctld wants to connect back to a range
>> of ports which are closed.
>
>
> As a test I stopped the firewall service on the submit machine; the
> command above now works fine.
You may want to check your firewall settings according to Slurm's
requirements. I've summarized this in my Wiki page:
https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#configure-firewall-for-slurm-daemons
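In short, srun listens on randomly chosen ports for connections coming back
from the compute nodes unless SrunPortRange is set in slurm.conf, so the
submit machine's firewall has to allow that range. A minimal sketch, assuming
firewalld is in use on the submit machine; the range 60001-63000 and the
helper name print_srun_firewall_plan are only illustrative, not values from
this thread. It prints the settings for review rather than applying them:

```shell
#!/bin/sh
# Example range only (an assumption); it must match SrunPortRange in slurm.conf.
SRUN_PORT_RANGE="60001-63000"

# Hypothetical helper: print the settings instead of applying them,
# so they can be reviewed before anything is changed.
print_srun_firewall_plan() {
    echo "# slurm.conf (same file on all nodes):"
    echo "SrunPortRange=${SRUN_PORT_RANGE}"
    echo "# on the submit machine, as root (firewalld assumed):"
    echo "firewall-cmd --permanent --add-port=${SRUN_PORT_RANGE}/tcp"
    echo "firewall-cmd --reload"
}

print_srun_firewall_plan
```

After changing slurm.conf the Slurm daemons need to pick up the new setting,
and the firewall service can then be started again on the submit machine.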
/Ole