[slurm-users] Limiting srun to a specific partition
ewan.roche at unil.ch
Tue Feb 15 10:13:01 UTC 2022
It doesn’t affect the use case of connecting via srun afterwards as no new job is submitted so the job_submit.lua logic is never called.
$ srun --pty /bin/bash
srun: error: submit_job: ERROR: interactive jobs are not allowed in the CPU or GPU partitions. Use the interactive partition
srun: error: Unable to allocate resources: Unspecified error
$ sbatch sleep.run
Submitted batch job 4624710
$ srun --jobid=4624710 --pty /bin/bash
[user@dna025 jobs]$ echo $TMPDIR
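
For reference, here is a minimal sketch of how the check quoted further down might sit inside a complete job_submit.lua. The partition names ("cpu", "gpu") and the "admin" QOS are taken from our snippet and are site-specific. The surrounding function layout is only an illustration, not our full production script. Note also that job_desc.partition may be unset when the user does not request a partition explicitly, so a site would presumably need to handle the default-partition case separately.

```lua
-- Sketch only: illustrative job_submit.lua skeleton around the interactive-job check.
-- Assumptions: partitions "cpu"/"gpu" and QOS "admin" are site-specific names.

function slurm_job_submit(job_desc, part_list, submit_uid)
    -- A job without a batch script is treated as interactive (srun/salloc).
    local is_interactive = (job_desc.script == nil or job_desc.script == '')

    if (job_desc.partition == "cpu" or job_desc.partition == "gpu")
            and job_desc.qos ~= "admin"
            and is_interactive then
        slurm.log_info("slurm_job_submit: interactive job in CPU/GPU partition, abort")
        slurm.log_user("submit_job: ERROR: interactive jobs are not allowed in the CPU or GPU partitions. Use the interactive partition")
        return slurm.ERROR
    end

    return slurm.SUCCESS
end

-- Called when an existing job is modified; nothing is restricted here in this sketch.
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end

return slurm.SUCCESS
```

Since 'srun --jobid=XXXX' only adds a step to an existing allocation, slurm_job_submit is not invoked for it, which is why no extra JobID check is needed in the script.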
Division Calcul et Soutien à la Recherche
UNIL | Université de Lausanne
> On 15 Feb 2022, at 10:27, Tina Friedrich <tina.friedrich at it.ox.ac.uk> wrote:
> ...that would interfere with users 'logging in' to a job to check on it though, wouldn't it? I mean we do have pam_slurm_adopt configured but I still tell people it's preferable to use 'srun --jobid=XXXX --pty /bin/bash' to check what a specific job is doing as pam_slurm_adopt doesn't seem to inherit things like the job-local tmp dir and such :(.
> So I'd at least check if it's an existing JobID you're trying to connect to.
> On 15/02/2022 08:08, Ewan Roche wrote:
>> Hi Peter,
>> as Rémi said, the way to do this in Slurm is via a job submit plugin. For example in our job_submit.lua we have
>> if (job_desc.partition == "cpu" or job_desc.partition == "gpu") and job_desc.qos ~= "admin" then
>>     if job_desc.script == nil or job_desc.script == '' then
>>         slurm.log_info("slurm_job_submit: jobscript is missing, assuming interactive job")
>>         slurm.log_info("slurm_job_submit: CPU/GPU partition for interactive job, abort")
>>         slurm.log_user("submit_job: ERROR: interactive jobs are not allowed in the CPU or GPU partitions. Use the interactive partition")
>>         return -1
>>     end
>> end
>> Which checks to see if the job script exists - this is more or less the definition of an interactive job.
>> Ewan Roche
>> Division Calcul et Soutien à la Recherche
>> UNIL | Université de Lausanne
>>> On 15 Feb 2022, at 08:47, Rémi Palancher <remi at rackslab.io> wrote:
>>> Hi Peter,
>>> On Monday 14 February 2022 at 18:37, Peter Schmidt <pschmidt at rivosinc.com> wrote:
>>>> slurm newbie here, converting from pbspro. In pbspro there is the capability of limiting interactive jobs (i.e. srun) to a specific queue (i.e. partition).
>>> Note that in Slurm, srun and interactive jobs are not the same thing. The srun command is for creating steps of jobs (interactive or not), optionally creating a job allocation beforehand if it does not exist.
>>> You can run interactive jobs with salloc and even attach your PTY to a running batch job to interact with it. On the other hand, batch jobs can create steps using the srun command.
>>> I don't know of any native Slurm feature to restrict interactive jobs (to a specific partition or whatever). However, using the job_submit Lua plugin and a custom Lua script, you might be able to accomplish what you are expecting. It has been discussed here:
>>> Rémi Palancher
>>> Rackslab: Open Source Solutions for HPC Operations
> Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator
> Research Computing and Support Services
> IT Services, University of Oxford
> http://www.arc.ox.ac.uk http://www.it.ox.ac.uk