<div dir="ltr"><div>Ronan, sorry to ask, but this is a bit unclear.</div><div><br></div><div>Are you unable to launch ANY sessions with srun?</div><div>If so, you need to look at the logs to see why the job is not being scheduled.<br></div><div><br></div><div>Or is it only the hostname command that fails?</div><div><br></div><div>I would guess you have already logged in to a node with ssh and run the hostname command manually.</div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 17 July 2018 at 09:50, Buckley, Ronan <span dir="ltr">&lt;<a href="mailto:Ronan.Buckley@dell.com" target="_blank">Ronan.Buckley@dell.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div link="#0563C1" vlink="#954F72" lang="EN-US">
<div class="m_-2832551837549021488WordSection1">
<p class="MsoNormal"><span style="color:#1f497d">Yes, I do.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#1f497d"><u></u> <u></u></span></p>
<div>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> slurm-users [mailto:<a href="mailto:slurm-users-bounces@lists.schedmd.com" target="_blank">slurm-users-bounces@<wbr>lists.schedmd.com</a>]
<b>On Behalf Of </b>Williams, Gareth (IM&T, Clayton)<br>
<b>Sent:</b> Tuesday, July 17, 2018 12:33 AM<br>
<b>To:</b> Slurm User Community List<br>
<b>Subject:</b> Re: [slurm-users] 'srun hostname' hangs on the command line<u></u><u></u></p>
</div>
</div><div><div class="h5">
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><span style="color:#1f497d" lang="EN-AU">Do you get the same problem as a non-root user?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="color:#1f497d" lang="EN-AU"><u></u> <u></u></span></p>
<div>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> slurm-users [<a href="mailto:slurm-users-bounces@lists.schedmd.com" target="_blank">mailto:slurm-users-bounces@<wbr>lists.schedmd.com</a>]
<b>On Behalf Of </b>Buckley, Ronan<br>
<b>Sent:</b> Tuesday, 17 July 2018 12:53 AM<br>
<b>To:</b> <a href="mailto:slurm-users@lists.schedmd.com" target="_blank">slurm-users@lists.schedmd.com</a><br>
<b>Subject:</b> [slurm-users] 'srun hostname' hangs on the command line<u></u><u></u></p>
</div>
</div>
<p class="MsoNormal"><span lang="EN-AU"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma",sans-serif">Hi All,<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma",sans-serif"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma",sans-serif">'srun hostname' hangs on the command line, and verbose mode doesn’t show much.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma",sans-serif">I have replaced the hostnames with # characters in the output below.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma",sans-serif">Any ideas or suggestions?<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma",sans-serif"><u></u> <u></u></span></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif"># srun hostname<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">^Csrun: interrupt (one more within 1 sec to abort)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: task 0: unknown<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">^Z<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">[1]+ Stopped srun hostname<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">#<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif"><u></u> <u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif"># srun -v hostname<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: defined options for program `srun'<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: --------------- ---------------------<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: user : `root'<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: uid : 0<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: gid : 0<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: cwd : /root<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: ntasks : 1 (default)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: nodes : 1 (default)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: jobid : 4294967294 (default)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: partition : default<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: profile : `NotSet'<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: job name : `(null)'<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: reservation : `(null)'<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: burst_buffer : `(null)'<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: wckey : `(null)'<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: cpu_freq_min : 4294967294<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: cpu_freq_max : 4294967294<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: cpu_freq_gov : 4294967294<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: switches : -1<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: wait-for-switches : -1<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: distribution : unknown<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: cpu_bind : default (0)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: mem_bind : default (0)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: verbose : 1<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: slurmd_debug : 0<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: immediate : false<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: label output : false<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: unbuffered IO : false<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: overcommit : false<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: threads : 60<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: checkpoint_dir : /var/slurm/checkpoint<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: wait : 0<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: nice : -2<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: account : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: comment : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: dependency : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: exclusive : false<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: bcast : false<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: qos : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: constraints :<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: geometry : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: reboot : yes<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: rotate : no<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: preserve_env : false<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: network : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: propagate : NONE<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: prolog : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: epilog : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: mail_type : NONE<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: mail_user : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: task_prolog : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: task_epilog : (null)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: multi_prog : no<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: sockets-per-node : -2<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: cores-per-socket : -2<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: threads-per-core : -2<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: ntasks-per-node : -2<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: ntasks-per-socket : -2<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: ntasks-per-core : -2<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: plane_size : 4294967294<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: core-spec : NA<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: power :<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: remote command : `hostname'<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: Waiting for nodes to boot (delay looping 450 times @ 0.100000 secs x index)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: Nodes ####### are ready for job<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: jobid 50871: nodes(1):`#######', cpu counts: 64(x1)<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: launching 50871.0 on host #######, 1 tasks: 0<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: route default plugin loaded<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: error: timeout waiting for task launch, started 0 of 1 tasks<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: Job step 50871.0 aborted before step completely launched.<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: Job step aborted: Waiting up to 32 seconds for job step to finish.<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">srun: error: Timed out waiting for job step to complete<u></u><u></u></span></i></p>
<p class="MsoNormal" style="margin-left:.5in"><i><span style="font-size:9.0pt;font-family:"Tahoma",sans-serif">#<u></u><u></u></span></i></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Tahoma",sans-serif"><u></u> <u></u></span></p>
<p class="MsoNormal">Regards<u></u><u></u></p>
<p class="MsoNormal"><u></u> <u></u></p>
</div></div></div>
</div>
</blockquote></div><br></div>
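<div dir="ltr"><div>To separate "the job is not being scheduled" from "the job is scheduled but the task never starts", the verbose output quoted above can be sorted into a rough stage. The following is only a minimal sketch: the function name and the stage labels are made up for illustration and are not part of Slurm, and it assumes the message strings look exactly like the ones in the quoted transcript, which may differ between Slurm versions.</div>
<pre>
```python
import re

# Lines copied from the quoted `srun -v hostname` transcript (hostnames hashed out).
SAMPLE = """\
srun: Waiting for nodes to boot (delay looping 450 times @ 0.100000 secs x index)
srun: Nodes ####### are ready for job
srun: jobid 50871: nodes(1):`#######', cpu counts: 64(x1)
srun: launching 50871.0 on host #######, 1 tasks: 0
srun: route default plugin loaded
srun: error: timeout waiting for task launch, started 0 of 1 tasks
"""


def classify_srun_log(text):
    """Return a coarse, hypothetical stage label for `srun -v` output.

    The labels are illustrative only; they are not defined by Slurm.
    """
    # An allocation line means the controller gave the job a node.
    allocated = re.search(r"srun: jobid \d+: nodes\(\d+\)", text) is not None
    # A launch line means srun tried to start the step on that node.
    launching = "srun: launching " in text
    # This error means the node never confirmed that the task started.
    launch_timeout = "timeout waiting for task launch" in text

    if launch_timeout:
        return "launch_timeout"
    if launching:
        return "launched"
    if allocated:
        return "allocated"
    return "not_scheduled"


print(classify_srun_log(SAMPLE))  # prints: launch_timeout
```
</pre>
<div>For the quoted transcript this gives launch_timeout: a node was allocated and reported ready, but the task did not start. That suggests checking the slurmd log on the compute node as well as the controller log.</div></div>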