[slurm-users] Need help with controller issues

Dean Schulze dean.w.schulze at gmail.com
Tue Dec 10 21:57:59 UTC 2019


There's a problem with accounting_storage/mysql plugin:

$ sudo slurmdbd -D -vvvv
slurmdbd: debug:  Log file re-opened
slurmdbd: pidfile not locked, assuming no running daemon
slurmdbd: debug3: Trying to load plugin /usr/lib/slurm/auth_munge.so
slurmdbd: debug:  Munge authentication plugin loaded
slurmdbd: debug3: Success.
slurmdbd: debug3: Trying to load plugin /usr/lib/slurm/accounting_storage_mysql.so
slurmdbd: error: Couldn't find the specified plugin name for accounting_storage/mysql looking at all files
slurmdbd: error: cannot find accounting_storage plugin for accounting_storage/mysql
slurmdbd: error: cannot create accounting_storage context for accounting_storage/mysql
slurmdbd: fatal: Unable to initialize accounting_storage/mysql accounting storage plugin


This bug report from a couple of years ago indicates a source code issue:

https://bugs.schedmd.com/show_bug.cgi?id=3278

This must have been fixed by now, though.

I built using slurm-19.05.2.  Does anyone know if this has been fixed in
19.05.4?
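
Before rebuilding, it may be worth confirming that the plugin file is really
absent from disk. The following is a minimal sketch, assuming the plugin
directory is /usr/lib/slurm as shown in the debug3 line above; check_plugin is
a hypothetical helper written for this check, not part of Slurm.

```shell
#!/bin/sh
# Plugin directory taken from the debug3 line in the slurmdbd output above.
plugin_dir=/usr/lib/slurm

# Hypothetical helper: report whether a plugin file exists as a regular file.
check_plugin() {
    if [ -f "$1" ]; then
        echo "present: $1"
    else
        echo "missing: $1"
    fi
}

check_plugin "$plugin_dir/accounting_storage_mysql.so"

# List whichever accounting_storage plugins were actually installed.
ls "$plugin_dir"/accounting_storage_*.so 2>/dev/null ||
    echo "no accounting_storage plugins found in $plugin_dir"
```

If the file turns out to be missing, that would suggest the plugin was never
built or installed rather than failing at load time.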



On Tue, Dec 10, 2019 at 2:05 PM Dean Schulze <dean.w.schulze at gmail.com>
wrote:

> I'm trying to set up my first slurm installation following these
> instructions:
>
> https://github.com/nateGeorge/slurm_gpu_ubuntu
>
> I've had to deviate a little bit because I'm using virtual machines that
> don't have GPUs, so I don't have a gres.conf file and in
> /etc/slurm/slurm.conf I don't have an entry like Gres=gpu:2 on the last
> line.
>
> On my controller VM I get errors when running simple commands:
>
> $ sinfo
> slurm_load_partitions: Unable to contact slurm controller (connect failure)
>
> $ sudo sacctmgr add cluster compute-cluster
> sacctmgr: error: slurm_persist_conn_open_without_init: failed to open persistent connection to localhost:6819: Connection refused
> sacctmgr: error: slurmdbd: Sending PersistInit msg: Connection refused
> sacctmgr: error: Problem talking to the database: Connection refused
>
>
> Something is supposed to be running on port 6819, but netstat shows
> nothing using that port.  What is supposed to be running on 6819?
>
> My database (MariaDB) is running.  I can connect to it with
> `sudo mysql -u root`.
>
> When I boot my controller which services are supposed to be running and on
> which ports?
>
> Thanks.
>
>
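
The sacctmgr errors in the quoted message name slurmdbd as the peer on
localhost:6819, so the port check described there can be scripted. This is a
rough sketch, assuming stock default ports (6817 for slurmctld and 6819 for
slurmdbd; a site can change both in its configuration) and net-tools netstat
output, where the local address is the fourth column. has_listener is a
hypothetical helper, not a Slurm command.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the `netstat -ltn` output on stdin contains
# a local address ending in ":<port>". Assumes the local address is column 4.
has_listener() {
    awk -v suffix=":$1" '
        length($4) >= length(suffix) &&
        substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
        END { exit !found }'
}

# Assumed default ports; adjust if slurm.conf or slurmdbd.conf override them.
if command -v netstat >/dev/null 2>&1; then
    for entry in slurmctld:6817 slurmdbd:6819; do
        name=${entry%%:*}
        port=${entry##*:}
        if netstat -ltn | has_listener "$port"; then
            echo "$name: something is listening on port $port"
        else
            echo "$name: nothing is listening on port $port"
        fi
    done
fi
```

A "nothing is listening" result for 6819 would be consistent with slurmdbd
exiting at startup, as in the fatal error at the top of this message.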