[slurm-users] Question About Restarting Slurmctld and Slurmd
Alex Chekholko
alex at calicolabs.com
Wed Jul 24 18:24:34 UTC 2019
Hi Chad,
Here is the most generally useful process I ended up with, implemented in a
local custom utility script:

1. Update slurm.conf everywhere
2. Stop slurmctld
3. Restart all slurmd processes
4. Start slurmctld
per:
https://wiki.fysik.dtu.dk/niflheim/SLURM#add-and-remove-nodes
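For what it's worth, here is a rough sketch of what such a script can look
like. This is not our exact script: the "compute" node group, the use of
clush (ClusterShell) for the parallel steps, and the systemd unit names are
assumptions you would adjust for your site.

    #!/bin/bash
    # Hypothetical restart helper; adjust node group and paths for your site.
    set -euo pipefail

    # 1. Push the updated slurm.conf to every compute node.
    clush -g compute --copy /etc/slurm/slurm.conf --dest /etc/slurm/slurm.conf

    # 2. Stop the controller so it never talks to a half-updated cluster.
    systemctl stop slurmctld

    # 3. Restart all slurmd daemons so they pick up the new config.
    clush -g compute 'systemctl restart slurmd'

    # 4. Bring the controller back up; running jobs are left untouched.
    systemctl start slurmctld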
I think you will only affect running jobs if you delete a partition; in
that case there is a different procedure:
https://slurm.schedmd.com/faq.html#delete_partition
However, I think a few other slurm.conf parameters may be more disruptive
to change; those have warnings in the man page.
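For the MaxJobCount change specifically, the edit itself is a one-line bump
in slurm.conf; the value below is just an example. Note that, as I
understand it, MaxJobCount is one of the settings that only takes effect on
a slurmctld restart, not via "scontrol reconfigure":

    # slurm.conf -- raise the active job table limit (default 10000)
    MaxJobCount=50000

You can confirm the running value afterwards with
"scontrol show config | grep MaxJobCount".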
Regards,
Alex
On Wed, Jul 24, 2019 at 10:28 AM Julius, Chad <Chad.Julius at sdstate.edu>
wrote:
> All,
>
> As our user base grows, we are getting close to hitting the default
> MaxJobCount of 10,000. Is it safe to edit this value, along with some
> fairshare settings, while cluster jobs are actively running? As in,
> what are the ramifications if I change the slurm.conf file and then
> restart slurmctld and the slurmd services on all of the nodes?
>
> My assumption is that the jobs will stay running and/or queued, but I
> would like some reassurance.
>
> Thanks,
>
> Chad