What is the recommended way to run a longer interactive job on your systems?
Our how-to includes starting screen on a front-end node and running srun with bash/zsh inside, but that creates a dependency between the login node (running screen) and the compute node job.
On systems with multiple front-ends, users need to remember the login node where they have their screen session.
Is anybody using something more advanced that is still understandable by a casual HPC user?
(I know Open OnDemand, but the native console often has certain benefits.)
cheers
josef
Hi,
On 26/02/2024 09:27, Josef Dvoracek via slurm-users wrote:
Is anybody using something more advanced that is still understandable by a casual HPC user?
I'm not sure it qualifies but:
    sbatch --wrap 'screen -D -m'
    srun --jobid <PREV JOB_ID> --pty screen -rd

Or:

    sbatch -J screen --wrap 'screen -D -m'
    srun --jobid $(squeue -n screen -h -o '%A') --pty screen -rd
Ward
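[Editor's sketch] The two commands in the message above could be wrapped in small shell functions so that casual users do not have to look up the job ID by hand. This is only an illustration under assumptions, not something posted in the thread: the function names and the JOB_NAME variable are hypothetical, and the `-u` and `-t RUNNING` filters on squeue are additions so that only your own running job is matched.

```shell
#!/bin/sh
# Hypothetical helpers around the sbatch/srun commands shown above.
# JOB_NAME is an assumption; pick any name that is unique among your jobs.
JOB_NAME="${JOB_NAME:-screen}"

# Submit a batch job whose only task is a screen session started detached
# but kept in the foreground (-D -m), so the job lives as long as screen does.
start_session() {
    sbatch -J "$JOB_NAME" --wrap 'screen -D -m'
}

# Print the ID of your running job with that name:
# -h drops the header line, -o '%A' prints only the job ID.
session_jobid() {
    squeue -u "$(id -un)" -n "$JOB_NAME" -t RUNNING -h -o '%A' | head -n 1
}

# Attach to the screen session inside the job, from any login node.
attach_session() {
    jobid=$(session_jobid)
    if [ -z "$jobid" ]; then
        echo "no running job named '$JOB_NAME'; run start_session first" >&2
        return 1
    fi
    srun --jobid "$jobid" --pty screen -rd
}
```

After sourcing the file, `start_session` is run once and `attach_session` can be repeated from whichever front-end the user lands on, since the screen session lives inside the job rather than on a login node.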
Josef,
For us, we put a load balancer in front of the login nodes with session affinity enabled. This makes users land on the same backend node each time.
Also, for interactive X sessions, users start a desktop session on the node and then use VNC to connect there. This accommodates disconnection for any reason, even for X-based apps.
Personally, I don't care much for interactive sessions in HPC, but there is a large body of users that only knows how to do things that way, so it is there.
Brian Andrus
On Tue, 2024-02-27 at 08:21 -0800, Brian Andrus via slurm-users wrote:
for us, we put a load balancer in front of the login nodes with session affinity enabled. This makes them land on the same backend node each time.
Hi Brian,
that sounds interesting - how did you implement session affinity?
cheers
magnus
Magnus,
That is a feature of the load balancer. Most of them have that these days.
Brian Andrus
Are most of us using HAProxy or something else?
-- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-leave@lists.schedmd.com
HAProxy, for on-prem things. In the cloud I just use their load balancers rather than implement my own.
Tim
-- Tim Cutts Scientific Computing Platform Lead AstraZeneca
From: Dan Healy via slurm-users <slurm-users@lists.schedmd.com>
Date: Wednesday, 28 February 2024 at 20:56
To: Brian Andrus <toomuchit@gmail.com>
Cc: slurm-users@lists.schedmd.com
Subject: [slurm-users] Re: [ext] Re: canonical way to run longer shell/bash interactive job (instead of srun inside of screen/tmux at front-end)?

Are most of us using HAProxy or something else?
Most of my stuff is in the cloud, so I use their load balancing services.
HAProxy does have sticky sessions, which you can enable based on source IP, so it also works with protocols other than HTTP:
2 Ways to Enable Sticky Sessions in HAProxy (Guide) https://www.haproxy.com/blog/enable-sticky-sessions-in-haproxy
Brian Andrus
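[Editor's sketch] For readers who have not seen IP-based stickiness, the snippet below writes a minimal HAProxy configuration for SSH logins in TCP mode. It is an illustration under assumptions, not the configuration used by anyone in this thread: the file name, the listening port 2222, the timeouts, the table size, and the two backend addresses (taken from the 192.0.2.0/24 documentation range) are all hypothetical placeholders.

```shell
#!/bin/sh
# Write a hypothetical HAProxy config that pins each client IP to one login node.
# CFG, the port, the timeouts and the server addresses are placeholders.
CFG="${CFG:-./haproxy-login.cfg}"

cat > "$CFG" <<'EOF'
defaults
    mode tcp
    timeout connect 5s
    timeout client  8h
    timeout server  8h

frontend ssh_in
    bind *:2222
    default_backend login_nodes

backend login_nodes
    balance roundrobin
    # Remember which server each source IP was sent to, for 8 hours.
    stick-table type ip size 100k expire 8h
    stick on src
    server login1 192.0.2.11:22 check
    server login2 192.0.2.12:22 check
EOF

echo "wrote $CFG"
```

If HAProxy is installed, `haproxy -c -f ./haproxy-login.cfg` checks the file for syntax errors before it is deployed. Note that stickiness by source IP sends all users behind one NAT address to the same login node.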
On 26/2/24 12:27 am, Josef Dvoracek via slurm-users wrote:
What is the recommended way to run a longer interactive job on your systems?
We provide NX for our users and also access via JupyterHub.
We also have high-priority QOSes intended for interactive use with rapid response, but they are capped at 4 hours (or 6 hours for Jupyter users).
All the best, Chris
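[Editor's sketch] From the user's side, a time-capped interactive QOS like the one described above is typically requested directly on the srun command line. The function below is a minimal illustration under assumptions: the QOS name "interactive" and the function name are hypothetical, and the actual QOS names and limits are site-specific.

```shell
#!/bin/sh
# Hypothetical wrapper for requesting an interactive shell under a capped QOS.
# QOS and TIME_LIMIT are placeholders; ask your site for the real values.
QOS="${QOS:-interactive}"
TIME_LIMIT="${TIME_LIMIT:-04:00:00}"

# Start a login shell on a compute node with a pseudo-terminal attached.
interactive_shell() {
    srun --qos="$QOS" --time="$TIME_LIMIT" --pty "${SHELL:-/bin/bash}"
}
```

Requesting the full cap up front avoids the job being killed earlier than expected, while a shorter `TIME_LIMIT` may help the job start sooner on a busy system.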