[slurm-users] restart slurmd on nodes w/ running jobs?

Paul Edmon pedmon at cfa.harvard.edu
Fri Jul 27 18:47:47 MDT 2018


Restarting slurmd should be fine, assuming the daemons come back before
the communication timeouts expire.  I restart slurmd all the time and
haven't had any real problems.
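
To see which timeouts are in play before restarting, something like the
following could help. It is only a minimal sketch: it assumes scontrol is
on the PATH and that "scontrol show config" prints "Name = value" lines.
The helper names (parse_timeouts, read_scontrol_config) are made up for
illustration. SlurmdTimeout is the setting that controls how long
slurmctld waits for a slurmd to respond before marking the node DOWN.

```python
import re
import shutil
import subprocess


def parse_timeouts(config_text):
    """Return {name: value} for each 'Name = value' line whose name ends in 'Timeout'."""
    timeouts = {}
    for line in config_text.splitlines():
        match = re.match(r"^\s*(\w+)\s*=\s*(.*?)\s*$", line)
        if match and match.group(1).endswith("Timeout"):
            timeouts[match.group(1)] = match.group(2)
    return timeouts


def read_scontrol_config():
    """Run 'scontrol show config' and return its standard output as text."""
    result = subprocess.run(
        ["scontrol", "show", "config"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only query Slurm when the scontrol binary is actually available.
    if shutil.which("scontrol") is None:
        print("scontrol not found; run this on a host with Slurm installed")
    else:
        found = parse_timeouts(read_scontrol_config())
        for name, value in sorted(found.items()):
            print(f"{name:20s} {value}")
```

If the slurmd restart takes noticeably less time than the SlurmdTimeout
value it reports, the controller should not have a reason to mark the
node down in the meantime.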

-Paul Edmon-


On 7/27/2018 6:51 PM, Chris Harwell wrote:
> It is possible, but double check your config for timeouts first.
>
> On Fri, Jul 27, 2018, 15:31 Prentice Bisbal <pbisbal at pppl.gov> wrote:
>
>     Slurm-users,
>
>     I'm still learning Slurm, so I have what I think is a basic question.
>     Can you restart slurmd on nodes where jobs are running, or will that
>     kill the jobs? I ran into the same problem as described here:
>
>     https://bugs.schedmd.com/show_bug.cgi?id=3535
>
>     I believe the best way to fix this is to restart slurmd on all my
>     nodes, but I've been unable to determine conclusively whether I can
>     do that w/o killing running jobs. I've spent some time googling
>     this, but couldn't find a definitive answer one way or the other. I
>     prefer to not kill a bunch of user jobs on a Friday afternoon.
>
>     -- 
>     Prentice
>
>
> -- 
> Chris Harwell
