[slurm-users] Problem with sbatch

Daniel Torregrosa daniel.torregrosa at insight-centre.org
Mon Jul 8 16:37:20 UTC 2019


You are right. The critical part I was missing is that chown to another
user does not work without root privileges.

I assume this can be fixed by setting "SlurmdUser=root" in the
configuration, but does this imply that anything run with `srun` will
actually be executed by root? This seems dangerous.
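
For reference, a minimal sketch of how the effective SlurmdUser could be
checked before changing anything. It assumes the configuration is a
slurm.conf-style file of keyword=value lines (on Ubuntu's slurm-wlm packages
this is probably /etc/slurm-llnl/slurm.conf, but that path is an assumption)
and that SlurmdUser defaults to root when it is not set. The function names
are hypothetical helpers, not part of Slurm. Note that slurmd running as root
is not expected to make jobs run as root: the log line "Launching batch job
xx for UID 1000" suggests the job is launched under the submitting user's
uid, and slurmd needs root privileges so it can hand the job's spool
directory over to that user.

```shell
# Print the effective SlurmdUser from a slurm.conf-style file ($1).
# Assumptions: keyword names are case-insensitive, the last matching
# line wins, and an unset SlurmdUser means "root".
slurmd_user_from_conf() {
    awk -F= '
        tolower($1) ~ /^[ \t]*slurmduser[ \t]*$/ {
            v = $2
            sub(/#.*/, "", v)      # drop a trailing comment
            gsub(/[ \t]/, "", v)   # drop surrounding whitespace
            user = v
        }
        END { print (user == "" ? "root" : user) }
    ' "$1"
}

# Report whether slurmd, as configured in $1, is likely able to chown
# job spool directories to the submitting user.
check_slurmd_user() {
    case "$(slurmd_user_from_conf "$1")" in
        root) echo "ok: slurmd runs as root and can chown job spool directories" ;;
        *)    echo "warning: slurmd does not run as root; chown to the job owner will likely fail" ;;
    esac
}
```

Calling `check_slurmd_user /etc/slurm-llnl/slurm.conf` on a compute node
configured as described in this thread (SlurmdUser=slurm) should print the
warning line.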

Thanks a lot.

On Mon, 8 Jul 2019 at 17:28, Jeffrey Frey <frey at udel.edu> wrote:

> Does user "slurm" have the capability of reowning files/directories to an
> arbitrary uid/gid?  Probably not -- that's something "root" can do, though.
>
>
>
>
> > On Jul 8, 2019, at 12:01 PM, Daniel Torregrosa <
> daniel.torregrosa at insight-centre.org> wrote:
> >
> > Hi all,
> >
> > I am currently testing Slurm (slurm-wlm 17.11.2 on a newly installed
> and updated Ubuntu Server LTS). I managed to make it work on a very simple
> configuration with 1 master node and 2 compute nodes. All three nodes have
> the same users (namely root, slurm and test), with slurm running both
> slurmctld and slurmd on the corresponding nodes (i.e. SlurmUser=slurm and
> SlurmdUser=slurm), and test as the only user that can log in.
> >
> > Commands such as `salloc` and `srun` work perfectly, but `sbatch` fails.
> In `squeue`, I get "(launch failed requeued held)". When I check the
> corresponding compute node log, I get "error:
> chown(/var/spool/slurmd/d/jobxxxxx): Operation not permitted". The previous
> line has "Launching batch job xx for UID 1000" (test) or 0 (root) if
> running `sudo sbatch`.
> >
> > Batch file looks like
> >
> > #! /bin/bash
> > #SBATCH -J myjob
> >
> > hostname
> >
> > I suspect that the problem is that `srun` and `salloc` are being run by
> SlurmdUser (slurm, i.e. `srun whoami` returns slurm), who owns
> /var/spool/slurmd, but sbatch tasks are being run by the user issuing the
> command (test).
> >
> > Should I chmod /var/spool/slurmd so any user can write there, or do I
> have a configuration problem? I feel like I am missing something critical
> here.
> >
> > Thanks a lot.
> > Daniel
>
>
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::
> Jeffrey T. Frey, Ph.D.
> Systems Programmer V / HPC Management
> Network & Systems Services / College of Engineering
> University of Delaware, Newark DE  19716
> Office: (302) 831-6034  Mobile: (302) 419-4976
> ::::::::::::::::::::::::::::::::::::::::::::::::::::::