Paul Edmon via slurm-users <slurm-users@lists.schedmd.com> writes:
It's definitely working for 23.11.8, which is what we are using.
It turns out we had unintentionally started firewalld on the login node. Now that this has been turned off, 'salloc' drops into a shell on a compute node as desired.
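In case it helps anyone else hitting the same symptom: with 'use_interactive_step', the interactive step on the compute node has to connect back to the salloc process on the login node, so a firewall there makes salloc hang right after the allocation is granted. A rough sketch of the two obvious options (the port range below is purely illustrative, not our actual value) is either to stop the firewall on the login node,

$ systemctl is-active firewalld
$ sudo systemctl disable --now firewalld

or, if the firewall has to stay on, pin the ports salloc/srun listen on via SrunPortRange in slurm.conf and open just that range:

SrunPortRange=60001-63000

$ sudo firewall-cmd --permanent --add-port=60001-63000/tcp
$ sudo firewall-cmd --reload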
Thanks for all the data points.
Cheers,
Loris
-Paul Edmon-
On 9/5/24 10:22 AM, Loris Bennett via slurm-users wrote:
Jason Simms via slurm-users <slurm-users@lists.schedmd.com> writes:
Ours works fine, however, without the InteractiveStepOptions parameter.
My assumption is also that the default value should be OK.
It would be nice if someone could confirm that 23.11.10 is working for them. However, we'll probably be upgrading to 24.05 fairly soon, so we shall see whether the issue persists.
Cheers,
Loris
JLS
On Thu, Sep 5, 2024 at 9:53 AM Carsten Beyer via slurm-users <slurm-users@lists.schedmd.com> wrote:
Hi Loris,
We use SLURM 23.02.7 (production) and 23.11.1 (test system). Our config contains a second parameter, InteractiveStepOptions, in slurm.conf:
InteractiveStepOptions="--interactive --preserve-env --pty $SHELL -l"
LaunchParameters=enable_nss_slurm,use_interactive_step
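As a cross-check (the grep pattern is only illustrative, and the output formatting differs slightly between versions), the values the running controller actually uses can be read back with:

$ scontrol show config | grep -i -e launchparameters -e interactivestepoptions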
That works fine for us:
[k202068@levantetest ~]$ salloc -N1 -A k20200 -p compute
salloc: Pending job allocation 857
salloc: job 857 queued and waiting for resources
salloc: job 857 has been allocated resources
salloc: Granted job allocation 857
salloc: Waiting for resource configuration
salloc: Nodes lt10000 are ready for job
[k202068@lt10000 ~]$
Best Regards,
Carsten
On 05.09.24 at 14:17, Loris Bennett via slurm-users wrote:
Hi,
With
$ salloc --version
slurm 23.11.10
and
$ grep LaunchParameters /etc/slurm/slurm.conf
LaunchParameters=use_interactive_step
the following
$ salloc --partition=interactive --ntasks=1 --time=00:03:00 --mem=1000 --qos=standard
salloc: Granted job allocation 18928869
salloc: Nodes c001 are ready for job
creates a job
$ squeue --me
     JOBID PARTITION     NAME   USER ST   TIME NODES NODELIST(REASON)
  18928779 interacti interact  loris  R   1:05     1 c001
but causes the terminal to block.
From a second terminal I can log into the compute node:
$ ssh c001
[13:39:36] loris@c001 (1000) ~
Is that the expected behaviour, or should salloc return a shell directly on the compute node (like 'srun --pty /bin/bash -l' used to do)?
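For comparison, the old-style invocation mentioned above, with the same parameters as my salloc call, would be something like (purely illustrative):

$ srun --partition=interactive --ntasks=1 --time=00:03:00 --mem=1000 --qos=standard --pty /bin/bash -l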
Cheers,
Loris
--
Carsten Beyer
Systems Department (Abteilung Systeme)

Deutsches Klimarechenzentrum GmbH (DKRZ)
Bundesstraße 45a * D-20146 Hamburg * Germany

Phone: +49 40 460094-221
Fax: +49 40 460094-270
Email: beyer@dkrz.de
URL: http://www.dkrz.de

Managing Director (Geschäftsführer): Prof. Dr. Thomas Ludwig
Registered office (Sitz der Gesellschaft): Hamburg
Commercial register: Amtsgericht Hamburg HRB 39784
--
Jason L. Simms, Ph.D., M.P.H.
Manager of Research Computing
Swarthmore College
Information Technology Services
(610) 328-8102
Schedule a meeting: https://calendly.com/jlsimms