[slurm-users] Trouble installing slurm-19.05.1-2.el7.centos.x86_64

Lou Nicotra lnicotra at interactions.com
Fri Aug 16 19:36:35 UTC 2019


Ok, thank you so much for that hint... I will try doing that and report
back.

Thanks!
Lou

On Fri, Aug 16, 2019 at 11:05 AM Brian Andrus <toomuchit at gmail.com> wrote:

> Ah. I suspect your issue may be CUDA 10.1, which does not
> create/register all the appropriate symlinks and "provides".
> I ran into that trying to install tensorflow.
>
> If you can, downgrade to 10.0, which does a better job of installing
> itself.
>
> Brian
> On 8/16/2019 5:47 AM, Lou Nicotra wrote:
>
> Brian, the package is being built and installed on the master server. I
> am testing by removing all instances of V18 and installing the newly
> created V19 slurm rpms. I get the error message on the slurm rpm install;
> all others (ctl, db, ...) install fine.
>
> After I get the error message, I remove all V19 rpms and reinstall
> V18 using the same procedure with no issues, and the system sees all
> nodes as it did before trying to install V19.
>
> The nvidia libraries are installed via the official Nvidia rpm,
> cuda-repo-rhel7-10-1-local-10.1.105-418.39-1.0-1.x86_64.rpm,
> supporting CUDA 10. This multi-GPU server is currently used by multiple
> users (DNN training) with no errors of any type while utilizing the
> nvidia libs/code.
>
> nvidia-smi command shows:
> NVIDIA-SMI 418.39       Driver Version: 418.39       CUDA Version: 10.1
>
> So, it is definitely something new to the V19 release... I have installed
> 18.08.0, .3, .4 and .8 on the same server and nodes since Sep of 2018 using
> the same procedures and never had any issues... Currently running 18.08.8
>
> Thanks.
> Lou
>
> On Thu, Aug 15, 2019 at 3:07 PM Brian Andrus <toomuchit at gmail.com> wrote:
>
>> Lou,
>>
>> Are you installing on the same machine you built?
>>
>> Are the nvidia libraries installed by RPM or a 'make install' on the box
>> you compiled it on?
>>
>> Brian Andrus
>> On 8/15/2019 7:53 AM, Lou Nicotra wrote:
>>
>> I have tried running ldconfig manually as suggested with
>> slurm-19.05.1-2 and it fails the same way...
>> error: Failed dependencies:
>>         libnvidia-ml.so.1()(64bit) is needed by slurm-19.05.1-2.el7.centos.x86_64
>>
>> ldconfig -p shows:
>> root@panther02 slurm# ldconfig -p|grep libnvidia-ml.
>>         libnvidia-ml.so.1 (libc6,x86-64) => /usr/lib64/libnvidia-ml.so.1
>>         libnvidia-ml.so.1 (libc6) => /lib/libnvidia-ml.so.1
>>         libnvidia-ml.so (libc6,x86-64) => /usr/lib64/libnvidia-ml.so
>>         libnvidia-ml.so (libc6) => /lib/libnvidia-ml.so
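
Note that ldconfig -p only reports what the dynamic linker can find. rpm resolves dependencies against its own database, so a library can be present in the linker cache and still be unknown to rpm. A rough shell sketch for checking which package, if any, owns the cached file, assuming the ldconfig -p layout shown above (ldcache_paths is a hypothetical helper name, not an existing tool):

```shell
# List the 64-bit paths the linker cache holds for one soname.
# Reads `ldconfig -p` output on stdin; $1 is the soname to look up.
# ldcache_paths is a made-up helper name used only for illustration.
ldcache_paths() {
    # Field 1 is the soname, field 2 the "(libc6,x86-64)" tag,
    # and the last field is the resolved path.
    awk -v so="$1" '$1 == so && $2 ~ /x86-64/ { print $NF }'
}
```

It could be used as follows, where rpm -qf reports the owning package for each path or says the file is not owned by any package:

    ldconfig -p | ldcache_paths libnvidia-ml.so.1 | xargs -r rpm -qf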
>>
>> Just tried the latest release slurm-19.05.2 and it fails in the same
>> way...
>> root@panther02 x86_64# rpm -Uvh slurm-19.05.2-1.el7.centos.x86_64.rpm
>> error: Failed dependencies:
>>         libnvidia-ml.so.1()(64bit) is needed by slurm-19.05.2-1.el7.centos.x86_64
>>
>> Reinstalled slurm-18.08.8 and it installs with no issues, just
>> like slurm-18.08.3 and slurm-18.08.4 did. All were built on the same
>> machine with the rpmbuild -ta command.
>> root@panther02 slurm-18.08.8# rpm -Uvh slurm-18.08.8-1.el7.centos.x86_64.rpm
>> Preparing...                          ################################# [100%]
>> Updating / installing...
>>    1:slurm-18.08.8-1.el7.centos       ################################# [100%]
>>
>> Oh, well...
>>
>> Lou
>>
>>
>>
>> On Mon, Aug 12, 2019 at 1:32 AM Barbara Krašovec <barbara.krasovec at ijs.si>
>> wrote:
>>
>>> What if you try to run ldconfig manually before building the rpm?
>>>
>>> Cheers,
>>>
>>> Barbara
>>> On 8/8/19 5:57 PM, Lou Nicotra wrote:
>>>
>>> I am running into an error while trying to
>>> install slurm-19.05.1-2.el7.centos.x86_64... Error is as follows:
>>> root@panther02 x86_64# rpm -Uvh slurm-19.05.1-2.el7.centos.x86_64.rpm
>>> error: Failed dependencies:
>>>         libnvidia-ml.so.1()(64bit) is needed by slurm-19.05.1-2.el7.centos.x86_64
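
The unresolved capability can be pulled out of that rpm output and then looked up in the RPM database. A minimal shell sketch, assuming the error text has the shape shown above (missing_caps is a hypothetical helper name):

```shell
# Print each unresolved capability named in rpm "Failed dependencies" output.
# Reads the rpm error text on stdin; missing_caps is a made-up helper name.
missing_caps() {
    # The capability is the first field on lines of the form
    # "<capability> is needed by <package>", even if the package name wraps.
    awk '/ is needed by/ { print $1 }' | sort -u
}
```

One possible use is a dry run of the install followed by a provider lookup for each name it prints:

    rpm -Uvh --test slurm-19.05.1-2.el7.centos.x86_64.rpm 2>&1 | missing_caps
    rpm -q --whatprovides 'libnvidia-ml.so.1()(64bit)'

If no installed package provides the capability, rpm refuses the install even when the library file itself exists on disk.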
>>>
>>> Packages are built using rpmbuild and complete with no errors:
>>> + cd /root/rpmbuild/BUILD
>>> + cd slurm-19.05.1-2
>>> + rm -rf /root/rpmbuild/BUILDROOT/slurm-19.05.1-2.el7.centos.x86_64
>>> + exit 0
>>>
>>> Investigation of the output while building the rpm package shows that
>>> nvidia-ml is found:
>>> checking for nvmlInit in -lnvidia-ml... yes
>>> .
>>> .
>>> libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I../../../..
>>> -I../../../../slurm -I../../../.. -I../../../../src/common
>>> -I/usr/local/cuda/include -I/usr/cuda/include -DNUMA_VERSION1_COMPATIBILITY
>>> -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
>>> -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches
>>> -m64 -mtune=generic -pthread -ggdb3 -Wall -g -O1 -fno-strict-aliasing -c
>>> gpu_nvml.c  -fPIC -DPIC -o .libs/gpu_nvml.o
>>> libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I../../../..
>>> -I../../../../slurm -I../../../.. -I../../../../src/common
>>> -I/usr/local/cuda/include -I/usr/cuda/include -DNUMA_VERSION1_COMPATIBILITY
>>> -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
>>> -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches
>>> -m64 -mtune=generic -pthread -ggdb3 -Wall -g -O1 -fno-strict-aliasing -c
>>> gpu_nvml.c -o gpu_nvml.o >/dev/null 2>&1
>>> /bin/sh ../../../../libtool  --tag=CC   --mode=link gcc
>>>  -DNUMA_VERSION1_COMPATIBILITY -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>> -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4
>>> -grecord-gcc-switches   -m64 -mtune=generic -pthread -ggdb3 -Wall -g -O1
>>> -fno-strict-aliasing -module -avoid-version --export-dynamic -Wl,-z,relro
>>> -o gpu_nvml.la -rpath /usr/lib64/slurm gpu_nvml.lo -lnvidia-ml
>>> libtool: link: gcc -shared  -fPIC -DPIC  .libs/gpu_nvml.o   -lnvidia-ml
>>>  -O2 -g -fstack-protector-strong -grecord-gcc-switches -m64
>>> -mtune=generic -pthread -ggdb3 -g -O1 -Wl,-z -Wl,relro   -pthread
>>> -Wl,-soname -Wl,gpu_nvml.so -o .libs/gpu_nvml.so
>>>
>>> The Makefile in /root/rpmbuild/BUILD/slurm-19.05.1-2/src
>>> includes: NVML_LIBS = -lnvidia-ml
>>> Previous releases (e.g. slurm-18.08.8) did not, and I was able to build
>>> that release with rpmbuild and install it with no issues.
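
The link step in the build log passes -lnvidia-ml when creating gpu_nvml.so, and rpmbuild's automatic dependency generation turns each shared library an ELF file needs into a package requirement. That would explain why the 19.05 package asks for libnvidia-ml.so.1 while 18.08 did not. A sketch of that mapping, assuming 64-bit objects, unversioned sonames, and objdump -p style input (needed_to_caps is a hypothetical name, not part of rpm):

```shell
# Turn "NEEDED" entries from `objdump -p <file>` into rpm-style capability names.
# Assumes 64-bit ELF objects and no symbol versioning, so each soname
# maps to "<soname>()(64bit)". needed_to_caps is a made-up helper name.
needed_to_caps() {
    awk '$1 == "NEEDED" { print $2 "()(64bit)" }'
}
```

Running it on the plugin in the build tree and comparing against what the built package declares should show the same name on both sides:

    objdump -p .libs/gpu_nvml.so | needed_to_caps
    rpm -qp --requires slurm-19.05.1-2.el7.centos.x86_64.rpm | grep nvidia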
>>>
>>> My LD_LIBRARY_PATH is
>>> /usr/lib64:/usr/lib:/usr/local/lib64:/usr/local/lib:/var/local/miniconda2/lib/:
>>>
>>> Can anyone provide suggestions on working out this issue?
>>>
>>> Thanks.
>>>  --
>>>
>>> LOU NICOTRA
>>>
>>> IT Systems Engineer - SLT
>>>
>>> Interactions LLC
>>>
>>> o:  908-673-1833 <781-405-5114>
>>>
>>> m: 908-451-6983 <781-405-5114>
>>>
>>> *lnicotra at interactions.com <lnicotra at interactions.com>*
>>> www.interactions.com
>>>
>>>
>>> *******************************************************************************
>>>
>>> This e-mail and any of its attachments may contain Interactions LLC
>>> proprietary information, which is privileged, confidential, or subject to
>>> copyright belonging to the Interactions LLC. This e-mail is intended solely
>>> for the use of the individual or entity to which it is addressed. If you
>>> are not the intended recipient of this e-mail, you are hereby notified that
>>> any dissemination, distribution, copying, or action taken in relation to
>>> the contents of and attachments to this e-mail is strictly prohibited and
>>> may be unlawful. If you have received this e-mail in error, please notify
>>> the sender immediately and permanently delete the original and any copy of
>>> this e-mail and any printout. Thank You.
>>>
>>>
>>> *******************************************************************************
>>>
>>>
>>>
>>
>
