[slurm-users] Interactive jobs using "srun --pty bash" and MPI
dragowsky at case.edu
Wed Nov 2 21:46:48 UTC 2022
When we started using Slurm some years ago, obtaining the interactive
resources through "srun ... --pty bash" was the standard that we adopted.
We are now (happily) running Slurm v22.05, though we recently noticed some
limitations when claiming resources to demonstrate or develop in an MPI
environment. A colleague today was revisiting a finding dating back to
January:
> I am having issues running interactive MPI jobs in the traditional way. It
> just stays there without executing:
>     srun -N 2 -n 4 --mem=4gb --pty bash
>     mpirun -n 4 ~/prime-mpi
> However, it does run with:
>     srun -N 2 -n 4 --mem=4gb ~/prime-mpi
As indicated, the first approach, claiming the resources to test/demo MPI
jobs via "srun ... --pty bash", no longer supports launching the job from
within the shell. We also ran srun with increased verbosity, and found that
the job steps are created and terminated before the prompt appears in the
requested shell.
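For anyone wishing to reproduce the check described above, a minimal sketch
(the resource requests mirror the colleague's report; srun's repeatable -v
flag raises logging detail, per the srun man page):

```shell
# Request the same resources, with verbose step logging.
# At -vv, srun reports job-step creation and completion, which is how we
# observed steps terminating before the interactive prompt was reached.
srun -vv -N 2 -n 4 --mem=4gb --pty bash
```

These commands of course require a Slurm cluster to run.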
While we infer that changes were implemented, could someone direct us to
documentation or a discussion of those changes and their motivation? We do
not doubt that the motivation is compelling; we ask only to improve our
understanding. Our team summarized the current operational behaviour as
follows:
> - "srun ... <executable>" works fine
> - "salloc -n4", then "ssh <node>", then "srun -n4 <executable>" works;
>   "mpirun -n4 <executable>" does not work
> - In batch mode, both mpirun and srun work.
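The salloc-based pattern from the summary above can be sketched as follows
(a sketch only; the resource requests and the ~/prime-mpi binary are carried
over from the earlier report, and whether ssh to the node is needed depends
on site configuration):

```shell
# Obtain an allocation first, rather than starting a pty step with srun.
salloc -N 2 -n 4 --mem=4gb

# Inside the allocation (or after ssh'ing to an allocated node),
# launch MPI ranks as a job step; per our summary, this works.
srun -n 4 ~/prime-mpi

# Release the allocation when done.
exit
```

This separates resource acquisition (salloc) from step launch (srun), so no
interactive shell step is holding the allocated tasks.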
Thanks to any and all who take the time to shed light on this matter.
E.M. (Em) Dragowsky, Ph.D.
Research Computing -- UTech
Case Western Reserve University