[slurm-users] Is it safe to convert cons_res to cons_tres on a running system?

Nathan R Crawford nrcrawfo at uci.edu
Fri Feb 21 19:38:52 UTC 2020


Hi Chris,

  If it just requires restarting slurmctld and the slurmd processes on the
nodes, I will be happy! Can you confirm that no running or pending jobs
were lost in the transition?
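
For anyone wanting to check this on their own system, here is a minimal sketch, not a tested procedure. It assumes a snapshot of the queue is saved before and after the restarts with something like `squeue -h -o "%i %T"` (job ID and state, no header). The function names and the sample data are hypothetical.

```python
def parse_squeue(text):
    """Parse `squeue -h -o "%i %T"` output into a {job_id: state} dict."""
    jobs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            jobs[parts[0]] = parts[1]
    return jobs


def missing_jobs(before, after):
    """Return job IDs present before the restart but absent afterwards."""
    return sorted(set(before) - set(after))


# Hypothetical snapshots, for illustration only.
before = parse_squeue("101 RUNNING\n102 PENDING\n103 RUNNING\n")
after = parse_squeue("101 RUNNING\n103 RUNNING\n")
print(missing_jobs(before, after))
```

A job that finished normally between the two snapshots will also show up as missing, so each ID reported would still need checking against the accounting records (for example with sacct) before concluding that it was lost.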

Thanks,
Nate

On Thu, Feb 20, 2020 at 6:54 PM Chris Samuel <chris at csamuel.org> wrote:

> On 20/2/20 2:16 pm, Nathan R Crawford wrote:
>
> >    I interpret this as, in general, changing SelectType will nuke
> > existing jobs, but that since cons_tres uses the same state format as
> > cons_res, it should work.
>
> We got caught with just this on our GPU nodes (though it was fixed
> before I got to see what was going on) - it seems that the format of the
> RPCs changes when you go from cons_res to cons_tres and we were having
> issues until we restarted slurmd on the compute nodes as well.
>
> My memory is that this was causing new jobs to fail completely at
> startup. I'm not sure what the consequences were for running jobs
> (though I suspect it would not have been great for them).
>
> If Doug sees this he may remember this (he caught and fixed it).
>
> All the best,
> Chris
> --
>   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
>
>

-- 

Dr. Nathan Crawford              nathan.crawford at uci.edu
Director of Scientific Computing
School of Physical Sciences
164 Rowland Hall                 Office: 2101 Natural Sciences II
University of California, Irvine  Phone: 949-824-4508
Irvine, CA 92697-2025, USA