[slurm-users] restart slurmd on nodes w/ running jobs?
Paul Edmon
pedmon at cfa.harvard.edu
Fri Jul 27 18:47:47 MDT 2018
Restarting slurmd should be fine, assuming the daemons come back before
the communication timeouts expire. I restart slurmd all the time and
haven't had any real problems.
-Paul Edmon-
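
A quick way to see which timeouts apply before restarting is to read them from the running configuration. The following is only a sketch, assuming `scontrol show config` prints settings as `Name = value sec` lines; the helper names (`parse_timeouts`, `current_timeouts`) and the sample values are made up for illustration. The thread does not say which timeout matters; `SlurmdTimeout` is the likely candidate, but that is an assumption.

```python
import re
import subprocess

# Illustrative sample only; these values are not from any real cluster.
SAMPLE_CONFIG = """\
SlurmctldTimeout        = 120 sec
SlurmdTimeout           = 300 sec
SlurmdUser              = root(0)
"""


def parse_timeouts(config_text):
    """Return {name: seconds} for every setting whose name ends in 'Timeout'."""
    timeouts = {}
    for line in config_text.splitlines():
        # Assumed line shape: "SlurmdTimeout           = 300 sec"
        match = re.match(r"\s*(\w*Timeout)\s*=\s*(\d+)", line)
        if match:
            timeouts[match.group(1)] = int(match.group(2))
    return timeouts


def current_timeouts():
    """Run `scontrol show config` on this host and parse its timeout settings."""
    result = subprocess.run(
        ["scontrol", "show", "config"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_timeouts(result.stdout)
```

On a host with the Slurm client tools installed, `current_timeouts()` returns the live values; `parse_timeouts(SAMPLE_CONFIG)` shows the parsing without touching a cluster. If slurmd on a node takes longer to come back than the relevant timeout, the controller may mark that node down, so compare the numbers against how long your restart actually takes.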
On 7/27/2018 6:51 PM, Chris Harwell wrote:
> It is possible, but double-check your config for timeouts first.
>
> On Fri, Jul 27, 2018, 15:31 Prentice Bisbal <pbisbal at pppl.gov> wrote:
>
> Slurm-users,
>
> I'm still learning Slurm, so I have what I think is a basic question.
> Can you restart slurmd on nodes where jobs are running, or will that
> kill the jobs? I ran into the same problem as described here:
>
> https://bugs.schedmd.com/show_bug.cgi?id=3535
>
> I believe the best way to fix this is to restart slurmd on all my nodes,
> but I've been unable to determine conclusively whether I can do that w/o
> killing running jobs. I've spent some time googling this, but couldn't
> find a definitive answer one way or the other. I prefer to not kill a
> bunch of user jobs on a Friday afternoon.
>
> --
> Prentice
>
>
> --
> Chris Harwell