[slurm-users] issue with mpirun when using through slurm / pmix

Bas van der Vlies bas.vandervlies at surf.nl
Fri Oct 22 06:35:08 UTC 2021


I have no other solution; this is what worked at our site.
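
For reference, here is a minimal sketch of how the two PMIX_MCA_psec values from the quoted message below could be tried for a single command, without exporting the variable into the whole shell session. The helper name with_psec is hypothetical, not part of Slurm, PMIx or Open MPI. Whether either value avoids the error (or the hang) depends on how PMIx and the MPI library were built at a given site.

```shell
#!/bin/sh
# Hypothetical helper: run one command with PMIX_MCA_psec set only for
# that command, leaving the surrounding shell environment untouched.
with_psec() {
    psec_value=$1
    shift
    env PMIX_MCA_psec="$psec_value" "$@"
}

# Example invocations inside the allocation (not run here):
#   with_psec native mpirun -c 2 /bin/hostname
#   with_psec none   mpirun -c 2 /bin/hostname
```

Scoping the variable to one command makes it easier to compare the two settings side by side before putting an export line into a job script or module file.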

> On 22 Oct 2021, at 03:19, pankajd <pankajd at cdac.in> wrote:
> 
> Thanks, but after setting PMIX_MCA_psec=native, mpirun now hangs and does not produce any output.
> 
> On October 21, 2021 at 9:21 PM Bas van der Vlies <bas.vandervlies at surf.nl> wrote: 
> > At our site we also had this problem, where the pmix lib was compiled with
> > munge support. We solved it by setting this environment variable:
> > * export PMIX_MCA_psec=native or export PMIX_MCA_psec=none
> >
> > Regards,
> > 
> > Bas 
> > 
> > On 21/10/2021 16:59, Pankaj Dorlikar wrote: 
> > > Hi, 
> > > 
> > > I am using slurm-20.11.7 compiled with pmix-3.2.3, and a job is
> > > submitted as below:
> > > 
> > > srun -N 1 -c 2 --pty /bin/bash 
> > > 
> > > On the allocated compute node, when I execute the command below, I get
> > > a PMIx error with return value -46:
> > > 
> > > mpirun -c 2 /bin/hostname 
> > > 
> > > --------------------------------------------------------------------------
> > > A requested component was not found, or was unable to be opened.  This
> > > means that this component is either not installed or is unable to be
> > > used on your system (e.g., sometimes this means that shared libraries
> > > that the component requires are unable to be found/loaded).  Note that
> > > PMIX stopped checking at the first component that it did not find.
> > >
> > > Host: cnode9
> > > Framework: psec
> > > Component: munge
> > > --------------------------------------------------------------------------
> > > --------------------------------------------------------------------------
> > > It looks like pmix_init failed for some reason; your parallel process is
> > > likely to abort.  There are many reasons that a parallel process can
> > > fail during pmix_init; some of which are due to configuration or
> > > environment problems.  This failure appears to be an internal failure;
> > > here's some additional information (which may only be relevant to an
> > > PMIX developer):
> > >
> > >   pmix_psec_base_open failed
> > >   --> Returned value -46 instead of PMIX_SUCCESS
> > > --------------------------------------------------------------------------
> > > [cnode9:2708617] PMIX ERROR: NOT-FOUND in file server/pmix_server.c at line 237
> > > 
> > > 
> > > ------------------------------------------------------------------------------------------------------------ 
> > > 
> > > [ C-DAC is on Social-Media too. Kindly follow us at: 
> > > Facebook: https://www.facebook.com/CDACINDIA & Twitter: @cdacindia ] 
> > > 
> > > This e-mail is for the sole use of the intended recipient(s) and may 
> > > contain confidential and privileged information. If you are not the 
> > > intended recipient, please contact the sender by reply e-mail and destroy 
> > > all copies and the original message. Any unauthorized review, use, 
> > > disclosure, dissemination, forwarding, printing or copying of this email 
> > > is strictly prohibited and appropriate legal action will be taken. 
> > > ------------------------------------------------------------------------------------------------------------ 
> > > 
> > 
> > -- 
> > Bas van der Vlies 
> > | HPCV Supercomputing | Internal Services | SURF | 
> > https://userinfo.surfsara.nl | 
> > | Science Park 140 | 1098 XG Amsterdam | Phone: +31208001300 | 
> > | bas.vandervlies at surf.nl
> 
> 



