[slurm-users] Problem with sbatch
wagner at itc.rwth-aachen.de
Tue Jul 9 13:33:41 UTC 2019
I strongly recommend letting SlurmdUser be root.
slurmd starts slurmstepd, but without root privileges, as the user who
owns the job. That is the program that actually executes the job script.
But slurmd itself needs to be root, e.g. to execute prolog and epilog
scripts, which in many cases need root access: to mount a filesystem for
the user, to create temp dirs per user and job ID, to delete them again
in the epilog, and so on.
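To make the prolog/epilog point concrete, here is a minimal sketch of such a pair, assuming the scripts see SLURM_JOB_ID and SLURM_JOB_UID in their environment. JOB_TMP_BASE and the function names are made up for illustration; they are not Slurm settings.

```shell
#!/bin/bash
# Hypothetical helpers for a prolog/epilog pair.
# JOB_TMP_BASE is an illustrative name, not a Slurm parameter.
JOB_TMP_BASE="${JOB_TMP_BASE:-/tmp/slurm-jobs}"

job_tmpdir() {
    # One directory per job, named after the job ID.
    printf '%s/%s\n' "$JOB_TMP_BASE" "$SLURM_JOB_ID"
}

prolog() {
    local dir
    [ -n "$SLURM_JOB_ID" ] || return 1
    dir="$(job_tmpdir)"
    mkdir -p "$dir" || return 1
    # Handing the directory over to another user is the step that
    # generally needs root.
    chown "$SLURM_JOB_UID" "$dir" || return 1
    chmod 700 "$dir"
}

epilog() {
    local dir
    # Refuse to run without a job ID, so the base directory is never removed.
    [ -n "$SLURM_JOB_ID" ] || return 1
    dir="$(job_tmpdir)"
    rm -rf -- "$dir"
}
```

In a real setup the Prolog script would call prolog and the Epilog script would call epilog. If slurmd does not run as root, the chown to a different user's uid is where this breaks.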
sbatch does not need root; it is a userspace program that submits a
batch job. It communicates with slurmctld, which in fact should not
run as root. As long as SlurmUser (the user slurmctld runs as) is not
root, slurmctld drops its privileges itself after startup.
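A quick way to see which accounts the two daemons are configured for is to read them out of slurm.conf. The following is only a sketch: the function names are hypothetical, the parser handles plain KEY=VALUE lines only, and an unset value is treated as root, which I believe matches the documented default.

```shell
#!/bin/bash
# conf_value KEY FILE: print the value of KEY from a slurm.conf-style file.
# Hypothetical helper; it only understands simple KEY=VALUE lines.
conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*#/ { next }                  # skip comment lines
        {
            k = $1
            gsub(/[ \t]/, "", k)             # tolerate leading whitespace
            if (tolower(k) == tolower(key)) {
                val = $2
                sub(/[ \t]*#.*$/, "", val)   # drop a trailing comment
                print val
                exit
            }
        }' "$2"
}

# check_daemon_users FILE: warn if the settings differ from the advice above.
check_daemon_users() {
    local conf="$1" slurmd_user slurm_user
    slurmd_user="$(conf_value SlurmdUser "$conf")"
    slurm_user="$(conf_value SlurmUser "$conf")"
    # Assumption: an unset value means root.
    [ "${slurmd_user:-root}" = "root" ] ||
        echo "warning: SlurmdUser is '$slurmd_user', prolog/epilog may lack root"
    [ "${slurm_user:-root}" != "root" ] ||
        echo "warning: SlurmUser is root, a dedicated account is preferable"
}
```

Called as check_daemon_users /etc/slurm/slurm.conf (the path depends on the installation), it prints nothing when SlurmdUser is root and SlurmUser is an unprivileged account.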
srun does (sometimes) two or even more things.
You can use salloc to get an allocation on some nodes and then use
srun to execute some code on the allocated nodes.
srun alone on a login node creates an allocation (as salloc would) and
then runs the code on the allocated nodes.
It is also used to run code from within the batch script on the
allocated nodes.
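The three uses of srun described above can be sketched as follows. The node count, the hostname command and the function names are only illustrative, and the first two functions of course need a working Slurm installation when called.

```shell
#!/bin/bash
# Three typical ways to use srun; all values are illustrative.

alloc_then_run() {
    # salloc obtains the allocation and runs the given command inside it;
    # srun then launches a job step on the allocated nodes.
    salloc --nodes=2 srun hostname
}

run_directly() {
    # srun on a login node creates the allocation itself,
    # then runs the command on the allocated nodes.
    srun --nodes=2 hostname
}

write_batch_script() {
    # Inside a batch script, srun launches steps within the
    # allocation that the batch job already holds.
    cat > "$1" <<'EOF'
#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --nodes=2
srun hostname
EOF
}
```

The file produced by write_batch_script would then be handed to sbatch by a normal user.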
The sbatch man page says about --uid:
Attempt to submit and/or run a job as user instead of the invoking user
id. The invoking user's credentials will be used to check access
permissions for the target partition. User root may use this option to
run jobs as a normal user in a RootOnly partition for example. If run as
root, sbatch will drop its permissions to the uid specified after node
allocation is successful. user may be the user name or numerical user
ID.
First of all, it is '--uid', not '-uid'.
Slurm silently ignores wrongly typed parameters in batch scripts. It
also ignores #SBATCH parameters that come after any non-comment line.
Second, --uid is ONLY allowed for root.
Finally, look at the output of "scontrol show job <jobid>" to check
whether the job really was submitted under another uid.
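Because these mistakes are ignored silently, a small check before submitting can help. This is a heuristic sketch of my own, not a Slurm tool: lint_sbatch is a hypothetical name, and the single-dash test will also flag legitimate short options with an attached alphabetic value (e.g. a partition name glued to -p).

```shell
#!/bin/bash
# lint_sbatch FILE: heuristic check of the #SBATCH lines in a batch script.
# Hypothetical helper, not part of Slurm.
lint_sbatch() {
    awk '
        NR == 1 && /^#!/ { next }    # shebang line
        /^[ \t]*$/       { next }    # blank lines do not end the header
        /^#SBATCH/ {
            if (body)
                print "line " NR ": #SBATCH after first command, will be ignored"
            if ($2 ~ /^-[A-Za-z][A-Za-z]/)
                print "line " NR ": " $2 " looks like a long option with a single dash"
            next
        }
        /^#/             { next }    # ordinary comments
                         { body = 1 } # first non-comment line reached
    ' "$1"
}
```

Run against the script quoted below, it reports the "-uid" line. Once that is corrected to --uid=test and the script is submitted by root, the UserId field in the "scontrol show job <jobid>" output shows which user the job belongs to.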
On 7/9/19 11:16 AM, Daniel Torregrosa wrote:
> Thanks a lot for the answers!
> So, if I understand this correctly, for some reason, `srun` does not
> need root privileges on the computation node side, but `sbatch` does
> when scheduling. I was afraid doing so would mean users could do
> things such as apt install and such, but it does not seem the case.
> I am not going to be managing the actual cluster, only exploring
> possibilities. At this point I am mostly convinced slurmdUser=sudo is
> safe, so that is one less potential problem.
> Maybe I should open a new thread, but, for some reason, when I submit
> #! /bin/bash
> #SBATCH -J myjob
> #SBATCH -uid test
> the execution silently fails, and the log in the computation node says
> "/home/test/d" does not exist. According to the documentation, -uid is
> intended for sudo to emulate sending jobs as different users, but the
> behaviour is a bit odd...
> @Patrick: I do not know how to do that. I only know that I can make
> slurm sudoer and NOPASSWD, but slurm would still call to `chown` (not
> `sudo chown`). An alternative would be replacing `chown` with a small
> script that calls `sudo chown`, but that is likely to break a lot of
> stuff. I assume slurmd will also need other root-only commands to work.
> @Michael Indeed, the documentation/tutorials often mention that
> SlurmdUser should be root, but it is not clearly explained why
> anywhere (e.g. https://slurm.schedmd.com/quickstart_admin.html section
> Daemons). It seems that `srun whoami` returns the current user (and
> not root), so even when slurmdUser is root, users do not have
> privileges, so in principle there is no problem at all.
> @Jeffrey It is expected to be multi-user. As for your third option, I
> think you refer to something similar to what I wrote for Patrick.
Marcus Wagner, Dipl.-Inf.
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de