[slurm-users] Strange behaviour with dynamically linked binary in batch job

Sebastian Potthoff s.potthoff at uni-muenster.de
Thu Mar 31 10:44:40 UTC 2022

Just a quick follow-up: I was able to resolve the issue. Maybe this helps someone in the future.

$BASH_ENV was pointing to a deprecated script that reset the module environment. This only affects non-interactive, non-login shells (i.e. sbatch jobs), which is why the issue did not show up when running in an interactive Slurm session.
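The mechanism can be reproduced outside of Slurm. The following is only a minimal sketch: the temporary file and the variable DEMO_MARKER are made up for illustration and stand in for the deprecated script. It relies on the documented bash behaviour that a non-interactive shell sources the file named in $BASH_ENV before running anything else, whereas an interactive shell ignores that variable.

```shell
#!/bin/bash
# Hypothetical stand-in for the script that $BASH_ENV pointed to.
demo_env=$(mktemp)
echo 'DEMO_MARKER=sourced' > "$demo_env"

# Non-interactive, non-login bash (comparable to a batch script):
# the file named in BASH_ENV is sourced before the command runs.
with_env=$(BASH_ENV="$demo_env" bash -c 'echo "${DEMO_MARKER:-unset}"')

# Same shell with BASH_ENV removed from the environment: nothing is sourced.
without_env=$(env -u BASH_ENV bash -c 'echo "${DEMO_MARKER:-unset}"')

echo "with BASH_ENV:    $with_env"
echo "without BASH_ENV: $without_env"
rm -f "$demo_env"
```

Inside a batch script, "echo $BASH_ENV" shows quickly whether such a file is in play at all.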

In addition, "ldd" seems to create a subprocess/shell? Using

/usr/lib64/ld-linux-x86-64.so.2 --list /path/to/binary

did not show the described issue.
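A likely explanation, offered here as an assumption rather than something verified on the cluster in question: on glibc-based systems "ldd" is typically a bash script, so each call starts a non-interactive bash that sources $BASH_ENV again, while calling the loader directly involves no shell. A small sketch to check what "ldd" is on a given machine (the variable names are only illustrative):

```shell
#!/bin/bash
# Classify ldd: "script" if the file starts with "#!", "binary" otherwise,
# "missing" if it is not on PATH.
ldd_path=$(command -v ldd || true)
if [ -z "$ldd_path" ]; then
    ldd_kind=missing
elif [ "$(head -c 2 "$ldd_path")" = "#!" ]; then
    ldd_kind=script
    head -n 1 "$ldd_path"   # shows which interpreter runs it
else
    ldd_kind=binary
fi
echo "ldd is: $ldd_kind"
```

If this reports a script whose interpreter is bash, the difference between "ldd" and the direct loader call is consistent with the $BASH_ENV finding.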

As for the other self-compiled binaries, which seemed to work fine: those had an RPATH set and therefore always found the correct MPI libs.
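Whether a binary carries such an entry can be checked with readelf from binutils. This is a sketch only: show_rpath is a helper name invented here, and /path/to/binary is a placeholder just as in the loader command.

```shell
#!/bin/bash
# Print the RPATH/RUNPATH entries from the dynamic section of an ELF file.
# Returns non-zero if readelf is unavailable or no such entry is found.
show_rpath() {
    command -v readelf >/dev/null || { echo "readelf not found" >&2; return 2; }
    readelf -d "$1" 2>/dev/null | grep -E '\((RPATH|RUNPATH)\)'
}

# Placeholder path; replace with the binary in question.
show_rpath /path/to/binary || echo "no RPATH/RUNPATH entry found (or file not readable)"
```

A binary without such an entry depends on LD_LIBRARY_PATH (or the system default paths) at run time, which is why only those binaries were affected by the reset module environment.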

Kind regards

> On 30.03.2022 at 18:08, Sebastian Potthoff <s.potthoff at uni-muenster.de> wrote:
> Hi Noam,
> Thanks for your suggestion - I already did this and confirmed that the modules and LD_LIBRARY_PATH are set correctly. Also, if something were wrong here, none of this would work with self-compiled binaries, which it does… which baffles me :-/
>> On 30.03.2022 at 17:51, Bernstein, Noam CIV USN NRL (6393) Washington DC (USA) <noam.bernstein at nrl.navy.mil> wrote:
>> One possibility is that something about the environment in the running batch job is making the "module load" commands fail, which they can do without any error (for old-fashioned Tcl-based env modules). Run "module list" afterwards, and echo $LD_LIBRARY_PATH, to confirm that it really is being set correctly in the batch job.

