[slurm-users] SlurmdSpoolDir
Kamil Wilczek
kmwil at mimuw.edu.pl
Tue Aug 16 19:10:01 UTC 2022
Maybe this was a noob question; I've just solved my problem.
I'll share my thoughts. I returned to my original settings
and reran the Ansible playbook, reconfiguring the SlurmdSpoolDir.
* https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmdSpoolDir_1
Maybe it is writable by root, because root can write everywhere
(except, at least, on the Kerberos'ed NFS), but my settings give
the user and group permissions only to the "slurm" user:
  - name: "create SlurmdSpoolDir directory"
    ansible.builtin.file:
      path: "{{ slurmd_spool_dir }}"
      state: "directory"
      owner: "{{ slurm_user }}"
      group: "{{ slurm_user }}"
      mode: "0770"
* Setting permissions for the SlurmdSpoolDir itself is not really important,
because at each "slurmd" restart those permissions are reset
by "slurmd" to "0755". The ownership is not changed. So, as a result:
drwxr-xr-x 2 slurm slurm 74 Aug 16 20:57 slurmd_spool
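* The missing part was the read and execute (search) permissions for
"other" on the SlurmdSpoolDir's parent directory, presumably because
the job script is run as the submitting user rather than as "slurm".
I had to set "775" instead of "770" for the parent dir, which in my
case is "/opt/slurm_state_dir":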
drwxrwxr-x 3 slurm slurm 26 Aug 11 19:49 slurm_state_dir
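
To spot this kind of problem more quickly, a small shell helper can walk from the spool directory up to "/" and report which directories lack the execute (search) bit for "other". This is only a sketch: the function name check_traversable and its "ok"/"BLOCKED" output format are made up for illustration, and it assumes GNU coreutils "stat" with the -c option.

```shell
#!/bin/sh
# Hypothetical helper: for a path and each of its parent directories,
# report whether "other" has the execute (search) bit set.
# Assumes GNU coreutils `stat -c`; not part of Slurm.
check_traversable() {
    p=$1
    rc=0
    while :; do
        if [ -d "$p" ]; then
            mode=$(stat -c '%a' "$p")      # octal mode, e.g. 770
            other=${mode#"${mode%?}"}      # last digit = bits for "other"
            if [ $((other & 1)) -eq 1 ]; then
                echo "ok $mode $p"
            else
                echo "BLOCKED $mode $p"
                rc=1
            fi
        fi
        # Stop at the filesystem root (or "." for relative paths).
        case $p in
            /|.) break ;;
        esac
        p=$(dirname "$p")
    done
    return "$rc"
}
```

Running check_traversable /opt/slurm_state_dir/slurmd_spool on the compute node should then print a "BLOCKED" line for any directory in the chain that a regular user cannot traverse, and return a non-zero status in that case. The "namei -l" command from util-linux prints a similar per-component listing without any scripting.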
Kind regards
--
Kamil Wilczek
On 16.08.2022 at 18:00, Kamil Wilczek wrote:
> Dear Slurm Users,
>
> recently, I have started a new instance of my cluster with Slurm 22.05.2
> (built from source). Everything seems to be configured properly and
> working fine except "sbatch". The error is quite self-explanatory and
> I thought it would be quite easy to fix directory permissions.
>
> slurmstepd: error: execve():
> /opt/slurm_state_dir/slurmd_spool/job00136/slurm_script: Permission denied
>
> I read here
> (https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmdSpoolDir_1) that
> the directory should
> be writable by root. I did that, but it did not help. I tried
> several other combinations of permissions, no improvement.
>
> Currently:
>
> # ls -l /opt/
> drwxrwx--- 3 slurm root 26 Aug 11 19:49 slurm_state_dir
>
> # tree -pug /opt/slurm_state_dir
> /opt/slurm_state_dir
> └── [drwxrwx--- root slurm ] slurmd_spool
> ├── [-rw------- root root ] cred_state
> ├── [-rw------- root root ] cred_state.old
> └── [-rw-r--r-- root root ] hwloc_topo_whole.xml
>
> Additionally, when I change the mode of the slurmd_spool directory,
> for example to 770, restarting the slurmd service changes it
> back to 755 irrespective of user/group.
>
> Could someone tell me what the correct settings should be?
> I did not have such problems when using 19.05.
>
> Kind Regards