[slurm-users] Need help with controller issues

Dean Schulze dean.w.schulze at gmail.com
Tue Dec 10 21:05:57 UTC 2019


I'm trying to set up my first slurm installation following these
instructions:

https://github.com/nateGeorge/slurm_gpu_ubuntu

I've had to deviate a little bit because I'm using virtual machines that
don't have GPUs, so I don't have a gres.conf file and in
/etc/slurm/slurm.conf I don't have an entry like Gres=gpu:2 on the last
line.

On my controller VM I get errors when trying to run simple commands:

$ sinfo
slurm_load_partitions: Unable to contact slurm controller (connect failure)

$ sudo sacctmgr add cluster compute-cluster
sacctmgr: error: slurm_persist_conn_open_without_init: failed to open
persistent connection to localhost:6819: Connection refused
sacctmgr: error: slurmdbd: Sending PersistInit msg: Connection refused
sacctmgr: error: Problem talking to the database: Connection refused


Something is supposed to be running on port 6819, but netstat shows nothing
using that port.  What is supposed to be running on 6819?

My database (MariaDB) is running.  I can connect to it with `sudo mysql -u
root`.

When I boot my controller, which services are supposed to be running, and
on which ports?
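
To make the port check repeatable, here is a minimal sketch that tries a TCP
connection to each port. The port numbers are an assumption: they are the
stock Slurm defaults (6817 for slurmctld, 6818 for slurmd, 6819 for
slurmdbd), and slurm.conf or slurmdbd.conf may override them. slurmd
normally runs on compute nodes, so it may legitimately be absent on a
controller. The helper names port_open and check_daemons are made up for
this example.

```python
import socket

# Assumed stock defaults; SlurmctldPort/SlurmdPort in slurm.conf and
# DbdPort in slurmdbd.conf can override them.
DEFAULT_PORTS = {
    "slurmctld": 6817,
    "slurmd": 6818,
    "slurmdbd": 6819,
}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_daemons(host="localhost", ports=DEFAULT_PORTS):
    """Map each daemon name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, is_open in check_daemons().items():
        state = "listening" if is_open else "not listening"
        print(f"{name:10s} port {DEFAULT_PORTS[name]}: {state}")
```

A port reported as "not listening" only says that nothing accepted the
connection; it does not say why the daemon is down, so the daemon's own log
is still the place to look next.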

Thanks.