[slurm-users] Question About Restarting Slurmctld and Slurmd

Alex Chekholko alex at calicolabs.com
Wed Jul 24 18:24:34 UTC 2019


Hi Chad,

Here is the process I found most generally useful; I implemented it in a
local custom utility script (a rough sketch of the script follows the link
below).

# Update slurm.conf everywhere
# Stop slurmctld
# Restart all slurmd processes
# Start slurmctld

per:
https://wiki.fysik.dtu.dk/niflheim/SLURM#add-and-remove-nodes
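
For reference, here is a minimal sketch of what such a script can look
like. It assumes systemd units and pdsh/pdcp for fan-out; the nodelist,
controller hostname, and paths are placeholders, not my actual site values:

#!/bin/bash
# Rough sketch of the utility script's logic (adjust for your site).
set -euo pipefail

CONF=/etc/slurm/slurm.conf
NODES="node[001-100]"    # placeholder compute nodelist
CTLD=headnode            # placeholder slurmctld host

# 1. Update slurm.conf everywhere
pdcp -w "$NODES" "$CONF" "$CONF"   # push to compute nodes
scp "$CONF" "$CTLD:$CONF"          # and to the controller

# 2. Stop slurmctld
ssh "$CTLD" systemctl stop slurmctld

# 3. Restart all slurmd processes
pdsh -w "$NODES" systemctl restart slurmd

# 4. Start slurmctld
ssh "$CTLD" systemctl start slurmctld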

I think you will only affect running jobs if you delete a partition; in
that case there is a different procedure:
https://slurm.schedmd.com/faq.html#delete_partition
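
If I recall correctly, the gist of that procedure is to stop new
submissions to the partition and let it empty out before removing it from
slurm.conf. Roughly (the partition name is illustrative; check the FAQ
for the authoritative steps):

scontrol update PartitionName=old_part State=DRAIN   # no new jobs queued
squeue -p old_part      # repeat until the partition has no jobs left
# then delete the Partition line from slurm.conf and restart slurmctld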

However, I think a few other parameters in slurm.conf may be more
disruptive to change; those carry explicit warnings in the slurm.conf man
page.
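
For the specific MaxJobCount change you asked about, this is roughly how I
would check and bump it (the new value is just an example):

# check the current value on the running controller:
scontrol show config | grep -i maxjobcount
# then in slurm.conf (everywhere), set e.g.:
#   MaxJobCount=50000

If I remember the slurm.conf man page correctly, MaxJobCount is one of the
parameters that requires a slurmctld restart rather than just "scontrol
reconfigure", which is why the restart procedure above applies.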

Regards,
Alex

On Wed, Jul 24, 2019 at 10:28 AM Julius, Chad <Chad.Julius at sdstate.edu>
wrote:

> All,
>
> As our user base grows, we are getting close to hitting the default
> MaxJobCount of 10,000.  Is it safe to edit this value, along with some
> fairshare settings, while cluster jobs are actively running?  As in,
> what are the ramifications if I change the slurm.conf file and then
> restart slurmctld and the slurmd services on all of the nodes?
>
> My assumption is that the jobs will stay running and/or queued but I would
> like some reassurance.
>
> Thanks,
>
> Chad

