[slurm-users] Header lengths are longer than data received after changing SelectType & GresTypes to use MPS

Eric Berquist berquist at isi.edu
Wed Apr 8 14:20:16 UTC 2020


I just ran into this issue. Specifically, SLURM's configure script looks for the NVML header file (nvml.h), which comes with CUDA or DCGM, in addition to the NVML library that comes with the drivers. The check is at https://github.com/SchedMD/slurm/blob/a763a008b7700321b51aad2e619deab00638a379/auxdir/x_ac_nvml.m4#L32. Once you’ve built SLURM, only the GPU drivers need to be present on the nodes where SLURM will be installed.
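
As a rough pre-build sanity check, something like the following could confirm that both pieces are visible on the build host before running configure. This is only a sketch, not the logic of the m4 check itself: the function name find_nvml, the include/lib/lib64 layout, and the example prefixes /usr and /usr/local/cuda are assumptions and will differ between distributions and CUDA installs.

```python
from pathlib import Path


def find_nvml(prefixes):
    """Look for nvml.h and libnvidia-ml.so* under the given install prefixes.

    Returns (header_path, library_path); each is None if nothing was found.
    The include/, lib64/ and lib/ subdirectory layout is an assumption.
    """
    header = None
    library = None
    for prefix in map(Path, prefixes):
        if header is None:
            candidate = prefix / "include" / "nvml.h"
            if candidate.is_file():
                header = candidate
        if library is None:
            for libdir in ("lib64", "lib"):
                directory = prefix / libdir
                if not directory.is_dir():
                    continue
                matches = sorted(directory.glob("libnvidia-ml.so*"))
                if matches:
                    library = matches[0]
                    break
    return header, library


if __name__ == "__main__":
    # Example prefixes only; adjust to where CUDA/DCGM and the driver live.
    header, library = find_nvml(["/usr", "/usr/local/cuda"])
    print("nvml.h:        ", header or "not found")
    print("libnvidia-ml:  ", library or "not found")
```

If either line reports "not found", it is likely that the build would end up without NVML support on that host.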

On Apr 8, 2020, at 9:32 AM, dean.w.schulze at gmail.com wrote:

I believe that in order to compile with NVML support you'll have to compile on a system with an Nvidia GPU installed; otherwise the Nvidia driver and libraries won't install on that system.

-----Original Message-----
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Christopher Samuel
Sent: Tuesday, April 7, 2020 10:08 PM
To: slurm-users at lists.schedmd.com
Subject: Re: [slurm-users] Header lengths are longer than data received after changing SelectType & GresTypes to use MPS

On 4/7/20 2:48 PM, Robert Kudyba wrote:

How can I get this to work by loading the correct Bright module?

You can't - you will need to recompile Slurm.

The error says:

Apr 07 16:52:33 node001 slurmd[299181]: fatal: We were configured to autodetect nvml functionality, but we weren't able to find that lib when Slurm was configured.

So when Slurm was built, the libraries you are now telling it to use were not detected, and the configure script disabled that functionality because it would not otherwise have been able to compile.
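
To see which nodes are running a build affected by this, one option is to scan the slurmd log output for that fatal message. A minimal sketch, assuming the log lines have already been collected as text; the helper name nvml_autodetect_failures is hypothetical and the match string is taken from the error quoted above:

```python
NVML_FATAL = "We were configured to autodetect nvml functionality"


def nvml_autodetect_failures(log_lines):
    """Return the log lines that carry the NVML autodetect fatal error."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if "fatal:" in line and NVML_FATAL in line
    ]


if __name__ == "__main__":
    sample = [
        "Apr 07 16:52:33 node001 slurmd[299181]: fatal: We were configured "
        "to autodetect nvml functionality, but we weren't able to find that "
        "lib when Slurm was configured.\n",
    ]
    for hit in nvml_autodetect_failures(sample):
        print(hit)
```

Any node whose log produces a hit needs a Slurm build that was configured with NVML available.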

All the best,
Chris
--
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA



