<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Hi, Juergen --</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">This is really useful information -- thanks for the pointer, and for taking the time to share!</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">And, Jacob -- can you point us to any primary documentation based on Juergen's observation that the change took place with v20.11?</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">With the emphasis on salloc, I find in the examples:</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"> To get an allocation, and open a new xterm in which srun commands may be typed interactively:<br><br> $ salloc -N16 xterm<br> salloc: Granted job allocation 65537</div></blockquote><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">which works as advertised (I'm not sure that i miss xterms or not -- at least on our cluster we dont configure them explicitly as a primary terminal tool)</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">And thanks also Chris and Jason for the validation and endorsement of these approaches.<br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Best, all!</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">~ Em<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Nov 2, 2022 at 5:47 PM Juergen Salk <<a href="mailto:juergen.salk@uni-ulm.de">juergen.salk@uni-ulm.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Em,<br>

this is most probably because in Slurm version 20.11 the behaviour of srun
was changed so that job steps are no longer allowed to overlap by default.

An interactive job launched by `srun --pty bash` always creates a regular
job step (step <jobid>.0), so mpirun or srun will hang when trying to
launch another job step from within this interactive step, as the two
steps would overlap.
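
As a quick check (the job id below is just a placeholder), the step
created for the interactive shell should show up as <jobid>.0 in the
step listing:

    $ squeue -s -j <jobid>
    $ sacct -j <jobid> --format=JobID,JobName,State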

You could try using the --overlap flag or `export SLURM_OVERLAP=1`
before running your interactive job to revert to the previous behaviour
that allows steps to overlap.
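
For illustration (a sketch only -- the resource numbers and the binary
are just the ones from the original report below):

    $ export SLURM_OVERLAP=1
    $ srun -N 2 -n 4 --mem=4gb --pty bash
    $ mpirun -n 4 ~/prime-mpi

The exported variable is inherited by the interactive shell, so steps
launched from inside it are also created with overlap allowed. The
equivalent flag on the interactive step itself would be
`srun --overlap ... --pty bash`.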

However, instead of using `srun --pty bash` for launching interactive
jobs, it is now recommended to use `salloc` and have
`LaunchParameters=use_interactive_step` set in slurm.conf.

`salloc` with `LaunchParameters=use_interactive_step` enabled will create
a special interactive step (step <jobid>.interactive) that does not
consume any resources and, thus, does not interfere with a new job step
launched from within this special interactive job step.
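
For example, a minimal sketch of what this looks like (the option line
goes into slurm.conf; the allocation reuses the resource numbers from the
original report and the job id is illustrative):

    LaunchParameters=use_interactive_step

    $ salloc -N 2 -n 4 --mem=4gb
    salloc: Granted job allocation 65538
    $ mpirun -n 4 ~/prime-mpi

With this setting, `salloc` without a command should drop you into a
shell on the first allocated node (the <jobid>.interactive step), and
srun or mpirun can then start regular job steps from there.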

Hope this helps.

Best regards
Jürgen

* Em Dragowsky <dragowsky@case.edu> [221102 15:46]:
> Greetings --
>
> When we started using Slurm some years ago, obtaining interactive
> resources through "srun ... --pty bash" was the standard that we adopted.
> We are now running Slurm v22.05 (happily), though we recently noticed
> some limitations when claiming resources to demonstrate or develop in an
> MPI environment. A colleague today was revisiting a finding dating back
> to January, which is:
>
> > I am having issues running interactive MPI jobs in the traditional
> > way. It just stays there without executing.
> >
> > srun -N 2 -n 4 --mem=4gb --pty bash
> > mpirun -n 4 ~/prime-mpi
> >
> > However, it does run with:
> > srun -N 2 -n 4 --mem=4gb ~/prime-mpi
> >
>
> As indicated, the first approach -- taking the resources to test/demo
> MPI jobs via "srun ... --pty bash" -- no longer supports launching the
> job. We also checked the srun environment with verbose output, and found
> that the job steps execute and terminate before the prompt appears in
> the requested shell.
>
> While we infer that changes were implemented, would someone be able to
> direct us to documentation or a discussion of the changes and the
> motivation? We do not doubt that there is compelling motivation; we ask
> in order to improve our understanding. As was summarized and shared
> amongst our team following our review of the current operational
> behaviour:
>
> >
> > - "srun ... <executable>" works fine
> > - "salloc -n4", "ssh <node>", "srun -n4 <executable>" works;
> >   using "mpirun -n4 <executable>" does not work
> > - In batch mode, both mpirun and srun work.
> >
> >
> Thanks to any and all who take the time to shed light on this matter.
>
>
> --
> E.M. (Em) Dragowsky, Ph.D.
> Research Computing -- UTech
> Case Western Reserve University
> (216) 368-0082
> they/them

--
E.M. (Em) Dragowsky, Ph.D.
Research Computing -- UTech
Case Western Reserve University
(216) 368-0082
they/them