[slurm-users] network/communication failure

Turner, Heath Hturner at eng.ua.edu
Mon May 21 09:05:21 MDT 2018


If anyone has advice, I would really appreciate it.

I am running a freshly installed slurm-17.11.6 with a master + 2 hosts.  It works locally on the master (controller + execution).  However, I cannot establish communication from the master [triumph01] to the 2 hosts [triumph02, triumph03].  Here is some more info:

1. munge is running, and munge verification tests all pass.
2. system clocks are in sync on master/hosts.
3. identical slurm.conf files are on master/hosts.
4. the resource configuration (memory/cpus/etc.) is correct and has been confirmed on all machines (all hardware is identical).
5. I have attached:
	a) slurm.conf
	b) log file from master slurmctld
	c) log file from host slurmd
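
One check not covered by the list above is plain TCP reachability between the nodes, since a firewall can block the Slurm daemons even when munge and the clocks are fine. The following is only a minimal sketch: it assumes the default ports (SlurmctldPort=6817, SlurmdPort=6818), which may differ from the attached slurm.conf, and the helper names port_open, check_nodes and report are made up for illustration.

```python
import socket

# Assumed Slurm defaults; adjust if slurm.conf sets SlurmctldPort/SlurmdPort.
SLURMCTLD_PORT = 6817
SLURMD_PORT = 6818


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_nodes(nodes, port, probe=port_open):
    """Map each node name to the result of probing the given port."""
    return {node: probe(node, port) for node in nodes}


def report(nodes, port):
    """Print one reachability line per node (hypothetical helper)."""
    for node, ok in check_nodes(nodes, port).items():
        state = "reachable" if ok else "NOT reachable"
        print(f"{node}:{port} {state}")
```

Run on the master, report(["triumph02", "triumph03"], SLURMD_PORT) would show whether slurmd can be reached on each host; run on a host, report(["triumph01"], SLURMCTLD_PORT) tests the opposite direction. A "NOT reachable" result would point toward a firewall or name-resolution problem rather than Slurm itself.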

Any ideas about what to try next?

Heath Turner

Professor
Graduate Coordinator
Chemical and Biological Engineering
http://che.eng.ua.edu
 
University of Alabama
3448 SEC, Box 870203
Tuscaloosa, AL  35487
(205) 348-1733 (phone)
(205) 561-7450 (cell)
(205) 348-7558 (fax)
hturner at eng.ua.edu
http://turnerresearchgroup.ua.edu

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: master-slurmctld-log.txt
URL: <http://lists.schedmd.com/pipermail/slurm-users/attachments/20180521/b0baaa4b/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: slurm.conf.txt
URL: <http://lists.schedmd.com/pipermail/slurm-users/attachments/20180521/b0baaa4b/attachment-0001.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: host-slurmd-log.txt
URL: <http://lists.schedmd.com/pipermail/slurm-users/attachments/20180521/b0baaa4b/attachment-0002.txt>

