[slurm-users] How should I configure a node with Autodetect=nvml?

Dean Schulze dean.w.schulze at gmail.com
Tue Feb 11 15:27:56 UTC 2020


There are no other errors in the logs.  The slurm.conf is identical on the
controller and on all nodes.  Only the node with GPUs has a gres.conf
(containing the single line Autodetect=nvml).

I got the error to stop by removing Gres=gpu:gp100:2 from the NodeName
line in slurm.conf on both the controller and the node, and by removing
the gres.conf from the node.
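
For reference, the configuration that produced the error looked roughly
like the sketch below.  It is reconstructed from the description in this
thread, not copied from the actual files: the GresTypes line is an
assumption (it is not shown anywhere in the thread), and the remaining
NodeName parameters are left out because they were not posted.

    # slurm.conf (identical on the controller and the nodes)
    # Assumption: GresTypes is not shown in this thread.
    GresTypes=gpu
    # Other NodeName parameters omitted.
    NodeName=slurmnode1 Gres=gpu:gp100:2

    # gres.conf (present only on slurmnode1)
    Autodetect=nvml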


On Mon, Feb 10, 2020 at 11:41 PM Chris Samuel <chris at csamuel.org> wrote:

> On Monday, 10 February 2020 12:11:30 PM PST Dean Schulze wrote:
>
> > With this configuration I get this message every second in my
> > slurmctld.log file:
> >
> >     error: _slurm_rpc_node_registration node=slurmnode1: Invalid argument
>
> What other errors are in the logs?
>
> Could you check that you've got identical slurm.conf and gres.conf files
> everywhere?
>
> All the best,
> Chris
> --
>   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
>