Thanks Tim, that fits with my observations. I will be back on it on the 13th and will see what effect upgrading the required RPMs has.

Sid


On Sat, 3 Aug 2024, 01:41 Cutts, Tim, <tim.cutts@astrazeneca.com> wrote:

As a general best practice I’d do this sort of thing with no jobs running, but some upgrades can go ahead without waiting for that.  Upgrading a package, even one which is currently in use by a running job, does not necessarily kill the job.  For example, upgrading a shared library won’t kill existing tasks: they already have the old library version open, so they will continue to use it.  New processes will pick up the replacement version.  Obviously that carries some risk, depending on what the job is, especially if the behaviour is different and this isn’t just a bug-fix release.
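
To illustrate the open-handle point, here is a minimal Python sketch. It assumes a POSIX system (Linux) and assumes the upgrade replaces the file by renaming a new copy over the old path rather than overwriting it in place. The file name libexample.so and the function name are hypothetical, and a plain text file stands in for the library.

```python
import os
import tempfile


def demo_replace_while_open():
    """Show that an already-open file keeps its old contents after the path is replaced.

    Returns (contents seen by the old handle, contents seen by a fresh open).
    """
    with tempfile.TemporaryDirectory() as workdir:
        lib_path = os.path.join(workdir, "libexample.so")  # hypothetical name
        with open(lib_path, "w") as f:
            f.write("version 1")

        # Stand-in for a running job that already has the library open.
        old_handle = open(lib_path)

        # Stand-in for the upgrade: write the new version beside the old one,
        # then rename it over the original path.
        new_path = lib_path + ".new"
        with open(new_path, "w") as f:
            f.write("version 2")
        os.replace(new_path, lib_path)

        try:
            seen_by_running_job = old_handle.read()
        finally:
            old_handle.close()

        # Stand-in for a process started after the upgrade.
        with open(lib_path) as f:
            seen_by_new_process = f.read()

        return seen_by_running_job, seen_by_new_process


if __name__ == "__main__":
    print(demo_replace_while_open())
```

The old handle still refers to the original inode, which the kernel keeps until the last user closes it, while a fresh open of the same path gets the new file. An upgrade that rewrote the file in place would not give this isolation.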

 

I’ve certainly done some security patches in the past on live systems; for example, upgrading openssh.  You need to take a risk-based approach to it.  The lowest-risk approach is to submit an exclusive job as root to drain the node, run the update and then reboot it.  But you might be waiting a long time, which is unacceptable for high-severity security patches.  The higher-risk approach is to use some other mechanism to run the update anyway: ansible, dsh, whatever your process is.
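
The lowest-risk route above could be sketched roughly as follows in Python. This only builds the sbatch command line and does not run anything. The node name node001, the job name, the update command "dnf -y update" and the helper name are all assumptions for illustration, and the reboot step assumes RebootProgram is configured in slurm.conf so that scontrol reboot works on your cluster.

```python
import shlex


def build_exclusive_update_job(node, update_cmd="dnf -y update"):
    """Return an sbatch argv that updates one node from inside an exclusive job.

    node       -- hypothetical node name, e.g. "node001"
    update_cmd -- assumed package update command; substitute your own
    """
    # Ask Slurm to reboot the node once it is idle and return it to service.
    reboot_cmd = f"scontrol reboot ASAP nextstate=RESUME {shlex.quote(node)}"
    # Only request the reboot if the update itself succeeded.
    script = f"{update_cmd} && {reboot_cmd}"
    return [
        "sbatch",
        "--exclusive",               # the job gets the whole node to itself
        f"--nodelist={node}",        # pin the job to the node being updated
        "--job-name=node-update",    # hypothetical job name
        "--wrap", script,
    ]


if __name__ == "__main__":
    # Print the command for review; submit it as root once you are happy with it.
    print(shlex.join(build_exclusive_update_job("node001")))
```

Because the job is exclusive, it only starts once everything else on the node has finished, which is where the potentially long wait comes from.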

 

Can you cope with the users turning up at your desk with flaming torches and pitchforks if it goes wrong?  😊

 

Regards,

 

Tim

-- 

Tim Cutts

Scientific Computing Platform Lead

AstraZeneca

 


 

 

From: Sid Young via slurm-users <slurm-users@lists.schedmd.com>
Date: Thursday, 1 August 2024 at 1:04 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Upgrade node while jobs running

G'day all, 

 

I've been waiting for nodes to become idle before upgrading them; however, some jobs take a long time. If I try to remove all the packages, I assume that kills the slurmstepd program and with it the job.

 

Sid

