- slurm-users - lists.schedmd.com

Implementing a "soft" wall clock limit
by Davide DelVento 24 Jun '25

24 Jun '25

In the institution where I work, so far we have managed to live without mandatory wallclock limits (a policy decided well before I joined the organization), and that has been possible because the cluster was not very much utilized. Now that is changing, with more jobs being submitted and those being larger ones. As such I would like to introduce wallclock limits to allow slurm to be more efficient in scheduling jobs, including with backfill. My concern is that this user base is not used to it and therefore I want to make it easier for them, and avoid common complaints. I anticipate one of them would be "my job was cancelled even though there were enough nodes idle and no other job in line after mine" (since the cluster utilization is increasing, but not yet always full like it has been at most other places I know). So my question is: is it possible to implement "soft" wallclock limits in slurm, namely ones which would not be enforced unless necessary to run more jobs? In other words, is it possible to change the pre-emptability of a job only after some time has passed? I can think of some ways to hack this functionality myself with some cron or at jobs, and that might be easy enough to do, but I am not sure I can make it robust enough to cover all situations, so I'm looking for something either slurm-native or (if external solution) field-tested by someone else already, so that at least the worst kinks have been already ironed out. Thanks in advance for any suggestions you may provide!

7 14

enforce Qos to users
by laddaoui＠telecom-paris.fr 24 Jun '25

24 Jun '25

Hello everyone, I'm trying to use QoS to enforce resource limits on an association, but I'm having trouble with proper enforcement. I created a QoS with resource limits: ``` sacctmgr add qos qos_gpus flags=denyonlimit,overpartqos maxjobsperuser=4 maxtresperjob=gres/gpu=1 ``` Then I assigned it to an account: ``` sacctmgr modify account name=account-a set qos=qos_gpus defaultqos=qos_gpus systemctl restart slurmctld ``` Users in this account can bypass the QoS limits by explicitly specifying a different QoS when submitting jobs: ``` srun --qos=(normal|qos_gpus) ... ``` Even though I set `defaultqos=qos_gpus`, users can still choose any available QoS and bypass the intended resource limits. My question is: How can I restrict users to only using their assigned QoS and prevent them from specifying other QoS options? Is there a configuration I'm missing to enforce QoS restrictions properly? Best, --- info about my setup slurm version : tested on 23.11.4 and 23.02.7 AccountingStorageEnforce = associations,limits EnforcePartLimits = ALL

2 2

pam_slurm_adopt - ssh to compute nodes not working in slurm 24.11
by Marx, Wolfgang 23 Jun '25

23 Jun '25

Hi, We have defined in our cluster, that users can logon to a compute node from a login node,when they have actual a job running on the compute node. To get this functionallity working, we are usinge the pam_slurm_adopt. As long as we were using slurm 23.05 it all was working well. Now we have upgraded to slurm 24.11.5 and the ssh login to the compute nodes is not longer working. When a job of a user is running on a compute node the ssh of the user to this compute node is refused form the compute node. We have not changed anything in our configuration for the pam_slurm_adopt. Also there is no indication in the release notes that anything has changed regarding pam_slurm_adopt. Is this a known bug in Slurm 24.11 and has anyone facing the same problem. This is a very important feature, especially for our ANSYS users. Thanks Wolfgang Marx Wolfgang Marx, Basisdienste, Gruppe Hochleistungrechnen Technische Universität Darmstadt, Hochschulrechenzentrum Alexanderstraße 2, 64283 Darmstadt Tel.: +496151/16-71158 E-Mail: wolfgang.marx(a)tu-darmstadt.de Web: www.hrz.tu-darmstadt.de

2 1

read-only slurm user
by Hagdorn, Magnus Karl Moritz 23 Jun '25

23 Jun '25

Hi there, we use the slurm prometheus exporter to collect slurm metrics. This works pretty well. However, we have noticed that metrics for some of the restricted partitions are not collected. It occurred to me that this is because we are using an unprivileged user to run the exporter. I am trying to figure out the best way to allow an unprivileged user to collect all metrics. I could add the user to all the relevant groups. However, I am also thinking of using the new slurm exporter that uses the API in which case I need to somehow handle a token. It would be nice to have a readonly user, ie a user that cannot submit any jobs but only read the current state of the cluster. I guess setting MaxJobs and MaxSubmitJobs to 0 would do this. Any other suggestions? Regards magnus -- Dr. Magnus Hagdorn Charité – Universitätsmedizin Berlin Geschäftsbereich IT | Scientific Computing Campus Charité Mitte BALTIC - Invalidenstraße 120/121 10115 Berlin https://www.charite.de HPC Helpdesk: sc-hpc-helpdesk(a)charite.de

1 0

Doc Clarification: Heterogeneous Steps in Heterogeneous Job
by Steffen Christgau 19 Jun '25

19 Jun '25

Hi everybody, I am (along with others) a little bit puzzled by the meaning of a statement in the documentation concerning heterogeneous job steps inside het. jobs. The docs state (https://slurm.schedmd.com/archive/slurm-24.11.5/heterogeneous_jobs.html#het…): > You also cannot request heterogeneous steps from within a heterogeneous job. (A) On a very small Slurm test installation with just two nodes, the following het job that requests het steps (does it, right?!) runs fine: $ cat hetjob-steps.sh #!/bin/bash #SBATCH --mem-per-cpu=2g --nodes=1 --cpus-per-task=8 #SBATCH hetjob #SBATCH --mem-per-cpu=1g --nodes=1 --cpus-per-task=4 srun -l --cpus-per-task=4 nproc : -l --cpus-per-task=2 nproc $ cat slurm-125.out 1: 4 2: 2 3: 2 0: 4 The output looks reasonable and it looks like the above quote does not apply since one can apparently request het steps in a het job. Or am I wrong? The intro in the respective section also gives the impression that het jobsteps are a convenience feature that does not require het jobs, but it does not explicitly exclude the usage of het steps in het jobs: > Slurm version 20.11 introduces the ability to request heterogeneous job steps from within a non-homogeneous job allocation. This allows you the flexibility to have different layouts for job steps without requiring the use of heterogeneous jobs, where having separate jobs for the components may be undesirable. So what does the initial statement (A) actually mean then? Am I just using a lucky example which is actually not supported? A short clarification would be helpful. Thanks in advance Steffen

1 0

Wrong MaxRSS Behavior with cgroup v2 in Slurm
by Guillaume COCHARD 19 Jun '25

19 Jun '25

Hello, We've noticed a recent change in how MaxRSS is reported on our cluster. Specifically, the MaxRSS value for many jobs now often matches the allocated memory, which was not the case previously. It appears this change is due to how Slurm accounts for memory when copying large files, likely as a result of moving from cgroup v1 to cgroup v2. Here’s a simple example: copy_file.sh #!/bin/bash cp /distributed/filesystem/file5G /tmp cp /tmp/file5G ~ Two jobs with different memory allocations: Job 1 sbatch -c 1 --mem=1G copy_file.sh seff <jobid> Memory Utilized: 1021.87 MB Memory Efficiency: 99.79% of 1.00 GB Job 2 sbatch -c 1 --mem=10G copy_file.sh seff <jobid> Memory Utilized: 4.02 GB Memory Efficiency: 40.21% of 10.00 GB With cgroup v1, this script typically showed minimal memory usage. Now, under cgroup v2, memory usage appears inflated and depends on the allocated memory, which seems wrong. I believe this behavior aligns with similar issues raised by the Kubernetes community [1], and is consistent with how memory.current behaves in cgroup v2 [3]. According to Slurm’s documentation about cgroup v2, "this plugin provides cgroup's memory.current value from the memory interface, which is not equal to the RSS value provided by procfs. Nevertheless it is the same value that the kernel uses in its OOM killer logic." [2] While technically correct, this seems to mark a significant change in what MaxRSS and "Memory Efficiency" actually measure and renders those metrics almost useless. Our Configuration: ProctrackType=proctrack/cgroup TaskPlugin=task/cgroup,task/affinity Question: Is there a way to restore more realistic MaxRSS values — specifically, ones that exclude file-backed page cache — while still using cgroup v2? Thanks, Guillaume References: [1] https://github.com/kubernetes/kubernetes/issues/118916 [2] https://slurm.schedmd.com/cgroup_v2.html#limitations [3] https://facebookmicrosites.github.io/cgroup2/docs/memory-controller.html

3 4

Re: Job information if job is completed
by Gestió Servidors 18 Jun '25

18 Jun '25

Hi, After reconfiguring slurm.conf to add script into the database, I have tried with a "normal" user to get that information (from a owned finished job). However, when I run "sacct -vvvv -B -j 92656", I get: sacct: Jobs Eligible in the time window from Epoch 0 to Wed Jun 18 15:33:03 2025 sacct: debug: Options selected: opt_completion=no opt_dup=no opt_field_list=User,JobID,Jobname%18,partition,state,time,submit,start,end,elapsed,nnodes,ncpus,nodelist, opt_help=0 opt_no_steps=yes opt_whole_hetjob=(null) sacct: debug: accounting_storage/slurmdbd: _connect_dbd_conn: Sent PersistInit msg sacct: debug2: Clusters requested: mycluster sacct: debug2: Userids requested: all sacct: debug2: Jobs requested: sacct: debug2: : 92656 sacct: error: Unknown error 1064 I have read that "sacct: error: Unknown error 1064" could be a error in the MySQL query syntax... but if I run same command "sacct -vvvv -B -j 92656" as root, I get submit script. I suppose problem is in database permissions, so I have added new permissions with "grant all on slurmdb.* TO 'user-sacct'@'localhost' identified by '..user-sacct..' with grant option;" (where user "user-sacct" is a user that acts as coordinator in SLURM). However, if I open a terminal with "user-sacct" and run "sacct -vvvv -B -j 92656", result is the same: "sacct: error: Unknown error 1064" How could give permissions to user-sacct to allow "sacct -B" command? Thanks.

1 0

Re: Job information if job is completed
by Gestió Servidors 18 Jun '25

18 Jun '25

Hello, So if there is no way to get all information from a finished job (specially submit script, not command line, but all content from submit script, like a copy of it), maybe a solution would be run a "prolog" script in each job to run a "cp" from the submit script. However, how could I copy the submit scrip from the prolog script? Because from the prolog script, I can access to some SLURM variables (https://slurm.schedmd.com/prolog_epilog.html) but I don't know how to know what the script is and run a simple "cp" to a destination folder. Thanks.

4 4

Re: [EXT] Re: slurm_pam_adopt module not working
by Ratnasamy, Fritz 17 Jun '25

17 Jun '25

Yes the file exists in /usr/lib64/security/. Best, *Fritz Ratnasamy*Data Scientist Information Technology On Tue, Jun 17, 2025 at 12:17 AM Sean Crosby <scrosby(a)unimelb.edu.au> wrote: > Hi Fritz, > > Does pam_slurm_adopt.so exist in the right location on the node? Normally > on EL hosts it would be /usr/lib64/security/pam_slurm_adopt.so > > # ls /usr/lib64/security/pam_slurm_adopt.so -la > -rwxr-xr-x 1 root root 291936 Mar 4 12:44 > /usr/lib64/security/pam_slurm_adopt.so > > If the file doesn't exist, pam would abnormally exit and not allow anyone > to log in. > > Sean > ------------------------------ > *From:* Ratnasamy, Fritz via slurm-users <slurm-users(a)lists.schedmd.com> > *Sent:* Tuesday, 17 June 2025 14:55 > *To:* Kevin Buckley <kevin.buckley.pawsey.org.au(a)gmail.com> > *Cc:* slurm-users(a)lists.schedmd.com <slurm-users(a)lists.schedmd.com> > *Subject:* [EXT] [slurm-users] Re: slurm_pam_adopt module not working > > * External email: Please exercise caution * > ------------------------------ > Thanks, for some reason I edited the /etc/pam.d/sshd via ansible but that > locked all users to the cluster. That same file works on a different > cluster where the files are pushed via puppet but with ansible it looks > like it is locking all users to the cluster. See below config file sshd: > > auth required pam_sepermit.so > auth substack password-auth > auth include postlogin > # Used with polkit to reauthorize users in remote sessions > -auth optional pam_reauthorize.so prepare > account required pam_nologin.so > ##SLURM > account sufficient pam_slurm_adopt.so action_no_jobs=deny > action_unknown=newest action_adopt_failure=deny action_generic_failure=deny > account sufficient pam_access.so > ##END SLURM > password include password-auth > # pam_selinux.so close should be the first session rule > session required pam_selinux.so close > session required pam_loginuid.so > # pam_selinux.so open should only be followed by sessions to be executed > in the user context > session required pam_selinux.so open env_params > session required pam_namespace.so > session optional pam_keyinit.so force revoke > session include password-auth > session include postlogin > # Used with polkit to reauthorize users in remote sessions > -session optional pam_reauthorize.so prepare > > > > *Fritz Ratnasamy *Data Scientist > Information Technology > > > > > On Wed, Jun 11, 2025 at 8:29 PM Kevin Buckley via slurm-users < > slurm-users(a)lists.schedmd.com> wrote: > > On 2025/06/11 12:46, Ratnasamy, Fritz via slurm-users wrote: > > > > We wanted to block users from ssh to a node where there are no jobs > > running however it looks like users are able to do so. We have installed > > the slurm_pam_adopt_module and set up the slurm.conf accordingly (the > same > > way we did on our first cluster where the pam module denies ssh access > > correctly). > > We saw a similar issue whereby the way that we had PAM setup, meant > that, and here I quote from SchedMD's Daniel Armengod: > > ----8<--------8<--------8<--------8<--------8<--------8<--------8<---- > This is almost certainly caused by the fact that SSH's > `keyboard-interactive` > (not to be confused with `password`) AuthMethod forks a short-lived child > process that is involved in the authentication logic. Slurm's > pam_slurm_adopt > module latches on to that process (which is the wrong one, of course) and > things break in interesting ways from there. > > SSH authmethods `publickey` and `password` do not exhibit this behaviour > as SSH > does not fork a child process to offload the authentication > challenge-response > dialogue to. > > ... > > The key bit here is that in your last test you're forcing > `PreferredAuthentications=password`, which isn't actually the > `keyboard-interactive` AuthMethod that got picked before. > They work differently under the hood, even if as far as the > user is concerned, both methods just ask for a password. > > ... > > In summary: try disabling the `keyboard-interactive` authentication method > in > your compute nodes. pam_slurm_adopt should work correctly now. > ----8<--------8<--------8<--------8<--------8<--------8<--------8<---- > > Maybe that's also your issue. > > > Daniel did say that SchedMD were going to update their documentation > to make that distinction, and it's effect, more explciit, so I would > expect it to be in the mainstream docs by now. > > HTH > > -- > slurm-users mailing list -- slurm-users(a)lists.schedmd.com > To unsubscribe send an email to slurm-users-leave(a)lists.schedmd.com > CAUTION: This email has originated outside of University email systems. > Please do not click links or open attachments unless you recognize the > sender and trust the contents as safe. > > CAUTION: This email has originated outside of University email systems. > Please do not click links or open attachments unless you recognize the > sender and trust the contents as safe. > >

2 1

Re: slurm_pam_adopt module not working
by Ratnasamy, Fritz 17 Jun '25

17 Jun '25

Thanks, for some reason I edited the /etc/pam.d/sshd via ansible but that locked all users to the cluster. That same file works on a different cluster where the files are pushed via puppet but with ansible it looks like it is locking all users to the cluster. See below config file sshd: auth required pam_sepermit.so auth substack password-auth auth include postlogin # Used with polkit to reauthorize users in remote sessions -auth optional pam_reauthorize.so prepare account required pam_nologin.so ##SLURM account sufficient pam_slurm_adopt.so action_no_jobs=deny action_unknown=newest action_adopt_failure=deny action_generic_failure=deny account sufficient pam_access.so ##END SLURM password include password-auth # pam_selinux.so close should be the first session rule session required pam_selinux.so close session required pam_loginuid.so # pam_selinux.so open should only be followed by sessions to be executed in the user context session required pam_selinux.so open env_params session required pam_namespace.so session optional pam_keyinit.so force revoke session include password-auth session include postlogin # Used with polkit to reauthorize users in remote sessions -session optional pam_reauthorize.so prepare *Fritz Ratnasamy*Data Scientist Information Technology On Wed, Jun 11, 2025 at 8:29 PM Kevin Buckley via slurm-users < slurm-users(a)lists.schedmd.com> wrote: > On 2025/06/11 12:46, Ratnasamy, Fritz via slurm-users wrote: > > > > We wanted to block users from ssh to a node where there are no jobs > > running however it looks like users are able to do so. We have installed > > the slurm_pam_adopt_module and set up the slurm.conf accordingly (the > same > > way we did on our first cluster where the pam module denies ssh access > > correctly). > > We saw a similar issue whereby the way that we had PAM setup, meant > that, and here I quote from SchedMD's Daniel Armengod: > > ----8<--------8<--------8<--------8<--------8<--------8<--------8<---- > This is almost certainly caused by the fact that SSH's > `keyboard-interactive` > (not to be confused with `password`) AuthMethod forks a short-lived child > process that is involved in the authentication logic. Slurm's > pam_slurm_adopt > module latches on to that process (which is the wrong one, of course) and > things break in interesting ways from there. > > SSH authmethods `publickey` and `password` do not exhibit this behaviour > as SSH > does not fork a child process to offload the authentication > challenge-response > dialogue to. > > ... > > The key bit here is that in your last test you're forcing > `PreferredAuthentications=password`, which isn't actually the > `keyboard-interactive` AuthMethod that got picked before. > They work differently under the hood, even if as far as the > user is concerned, both methods just ask for a password. > > ... > > In summary: try disabling the `keyboard-interactive` authentication method > in > your compute nodes. pam_slurm_adopt should work correctly now. > ----8<--------8<--------8<--------8<--------8<--------8<--------8<---- > > Maybe that's also your issue. > > > Daniel did say that SchedMD were going to update their documentation > to make that distinction, and it's effect, more explciit, so I would > expect it to be in the mainstream docs by now. > > HTH > > -- > slurm-users mailing list -- slurm-users(a)lists.schedmd.com > To unsubscribe send an email to slurm-users-leave(a)lists.schedmd.com > CAUTION: This email has originated outside of University email systems. > Please do not click links or open attachments unless you recognize the > sender and trust the contents as safe. > >

2 1

2025

2024

slurm-users ----- 2025 ----- July 2025 June 2025 May 2025 April 2025 March 2025 February 2025 January 2025 ----- 2024 ----- December 2024 November 2024 October 2024 September 2024 August 2024 July 2024 June 2024 May 2024 April 2024 March 2024 February 2024 January 2024

slurm-users