How was your binary compiled?

If it is dynamically linked, please reply with the ldd listing of the binary (ldd binary).
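
For example (the install prefix below is an assumption, not your real path), the listing shows which MPI library the binary was actually linked against:

$ ldd ./hello
        ...
        libmpi.so.40 => /opt/openmpi-5.0.2/lib/libmpi.so.40 (0x00007f...)
        ...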

Jenny

From: S L via slurm-users <slurm-users@lists.schedmd.com>
Sent: Tuesday, February 20, 2024 10:55 AM
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] RHEL 8.9+SLURM-23.11.3+MLNX_OFED_LINUX-23.10-1.1.9.0+ OpenMPI-5.0.2

Hi All,

We're currently setting up Slurm on an RHEL 8.9-based cluster. Here's a summary of the steps we've taken so far:

Installed MLNX OFED ConnectX-5.2.
Compiled and installed PMIx and UCX.
Compiled and installed Slurm with pmix_v4 and UCX support.
Compiled Open MPI with Slurm, PMIx, libevent, and hwloc support (a sketch of the configure flags follows this list).
All compute nodes are reachable via the IB network.
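
For reference, a build along those lines typically uses configure flags roughly like the following; the /opt install prefixes here are illustrative assumptions, not our actual paths:

# Slurm against external PMIx and UCX
./configure --prefix=/opt/slurm --with-pmix=/opt/pmix --with-ucx=/opt/ucx

# Open MPI 5.0.2 against the same PMIx/UCX stack
./configure --prefix=/opt/openmpi-5.0.2 --with-slurm \
    --with-pmix=/opt/pmix --with-ucx=/opt/ucx \
    --with-libevent=/opt/libevent --with-hwloc=/opt/hwloc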

Problem: while hello-world MPI jobs run fine across multiple nodes, they are not using InfiniBand.

srun --mpi=pmix -N2 -n2 --ntasks-per-node=2 ./hello > log.out 2>&1
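
One way to check which transport UCX actually selects (mlx5_0:1 below is an example device name; ibstat lists the real ones):

# List the transports and devices UCX detects on a node
ucx_info -d | grep -e Transport: -e Device:

# Force the UCX PML and raise UCX logging; the run fails loudly if
# InfiniBand cannot actually be used
OMPI_MCA_pml=ucx UCX_NET_DEVICES=mlx5_0:1 UCX_LOG_LEVEL=info \
    srun --mpi=pmix -N2 -n2 ./hello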

Output from srun --mpi=list:

MPI plugin types are...
        none
        cray_shasta
        pmi2
        pmix
specific pmix plugin versions available: pmix_v4

Could someone please point me in the right direction on how to troubleshoot this issue?

Thank you for your assistance.

Sudhakar