[slurm-users] Need help with controller issues

Renfro, Michael Renfro at tntech.edu
Tue Dec 10 21:11:52 UTC 2019


What do you get from

systemctl status slurmdbd
systemctl status slurmctld

I’m assuming at least slurmdbd isn’t running; by default it is the daemon that listens on port 6819.
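
If it helps, here is a rough sketch for checking the ports directly. It assumes the stock defaults (slurmctld on 6817, slurmd on 6818, slurmdbd on 6819); if your slurm.conf or slurmdbd.conf sets SlurmctldPort, SlurmdPort, or DbdPort, adjust the numbers. The function names `port_state` and `check_slurm_ports` are made up for this example and are not part of Slurm.

```shell
#!/usr/bin/env bash
# Sketch: report whether anything accepts TCP connections on the
# default Slurm ports. The port numbers are assumed defaults:
#   slurmctld 6817, slurmd 6818, slurmdbd 6819

# port_state HOST PORT -> prints "open" or "closed"
port_state() {
    local host=$1 port=$2
    # bash's /dev/tcp pseudo-device attempts a TCP connect; running it
    # in a subshell means the descriptor is closed again on exit.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
        echo open
    else
        echo closed
    fi
}

# check_slurm_ports [HOST] -> one line per daemon
check_slurm_ports() {
    local host=${1:-localhost}
    local entry
    for entry in slurmctld:6817 slurmd:6818 slurmdbd:6819; do
        printf '%-10s port %s: %s\n' "${entry%%:*}" "${entry##*:}" \
            "$(port_state "$host" "${entry##*:}")"
    done
}

check_slurm_ports localhost
```

On the controller you would expect slurmctld and slurmdbd to show as open; slurmd only runs on the compute nodes. If slurmdbd shows as closed, `journalctl -u slurmdbd` should say why it failed to start.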

> On Dec 10, 2019, at 3:05 PM, Dean Schulze <dean.w.schulze at gmail.com> wrote:
> 
> I'm trying to set up my first slurm installation following these instructions:
> 
> https://github.com/nateGeorge/slurm_gpu_ubuntu
> 
> I've had to deviate a little bit because I'm using virtual machines that don't have GPUs, so I don't have a gres.conf file and in /etc/slurm/slurm.conf I don't have an entry like Gres=gpu:2 on the last line.
> 
> On my controller VM I get errors when trying to run simple commands:
> 
> $ sinfo
> slurm_load_partitions: Unable to contact slurm controller (connect failure)
> 
> $ sudo sacctmgr add cluster compute-cluster
> sacctmgr: error: slurm_persist_conn_open_without_init: failed to open persistent connection to localhost:6819: Connection refused
> sacctmgr: error: slurmdbd: Sending PersistInit msg: Connection refused
> sacctmgr: error: Problem talking to the database: Connection refused
> 
> 
> Something is supposed to be running on port 6819, but netstat shows nothing using that port.  What is supposed to be running on 6819?
> 
> My database (MariaDB) is running.  I can connect to it with `sudo mysql -u root`.
> 
> When I boot my controller which services are supposed to be running and on which ports?
> 
> Thanks.
> 


