Hi Magnus,
Thanks for your reply! If you can, would you mind sharing the InitScript from your attempt at getting it to work?
Best,
Tim
On 06.02.24 15:19, Hagdorn, Magnus Karl Moritz wrote:
Hi Tim, we are using the job_container/tmpfs plugin to map /tmp to a local NVMe drive, which works great. I did consider setting up directory quotas. I thought the InitScript [1] option should do the trick; alas, I didn't get it to work. If I remember correctly, Slurm complained about the option being present. In the end we recommend that our users request exclusive use of a node if they are going to use a lot of local scratch space. I don't think this happens very often, if at all.
Regards magnus
[1] https://slurm.schedmd.com/job_container.conf.html#OPT_InitScript
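For reference, the tmpfs-on-NVMe side of this only needs a couple of lines of configuration; roughly the following (the BasePath here is just an example path, not our actual one):

    # slurm.conf
    JobContainerType=job_container/tmpfs

    # job_container.conf
    AutoBasePath=true
    # per-job private mounts for /tmp are created under this path,
    # which sits on the node's local NVMe drive
    BasePath=/mnt/nvme/slurm-jobtmp

The directory-quota part is where the InitScript option was supposed to come in.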
On Tue, 2024-02-06 at 14:39 +0100, Tim Schneider via slurm-users wrote:
Hi,
In our Slurm cluster, we are using the job_container/tmpfs plugin to ensure that each user can use /tmp and that it gets cleaned up after them. Currently, we map /tmp into the node's RAM, so the cgroups ensure that users can only use a certain amount of storage inside /tmp.
Now we would like to use the node's local SSD instead of its RAM to hold the files in /tmp. I have seen people define local storage as a GRES, but I am wondering how to make sure that users do not exceed the storage space they requested in a job. Does anyone have an idea how to configure local storage as a properly tracked resource?
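To illustrate what I mean by defining local storage as a GRES, the configuration I have seen is roughly the following (the name "localtmp" and all the counts are placeholders):

    # slurm.conf
    GresTypes=localtmp
    NodeName=node[01-10] Gres=localtmp:800 ...

    # gres.conf on each node (here one unit is meant to be 1 GB of local SSD)
    NodeName=node[01-10] Name=localtmp Count=800

    # job script
    #SBATCH --gres=localtmp:100

As far as I understand, this only accounts for the space at scheduling time; nothing stops a job from writing more than the 100 GB it asked for, which is exactly the enforcement part I am unsure about.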
Thanks a lot in advance!
Best,
Tim
Hi Tim, in the end the InitScript didn't contain anything useful, because slurmd rejected the option itself:
slurmd: error: _parse_next_key: Parsing error at unrecognized key: InitScript
At that stage I gave up. This was with Slurm 23.02. My plan was to set up the local scratch directory with XFS and then have the script apply a project quota, i.e. a quota attached to the directory.
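The rough idea (never tested, since slurmd rejected the option before I got that far) was something along these lines, assuming the scratch filesystem is XFS mounted with project quotas enabled (prjquota) and that the script can somehow learn the per-job directory and the requested size:

    #!/bin/bash
    # untested sketch of the intended InitScript
    MNT=/mnt/nvme            # XFS mount point holding the BasePath (example path)
    JOBDIR="$1"              # per-job directory; I never checked what the plugin actually passes
    PROJID="$SLURM_JOB_ID"   # reuse the job id as the XFS project id (assumption)
    LIMIT=100g               # hard limit; would have to come from the job request somehow

    # attach the directory to the project and put a hard block limit on it
    xfs_quota -x -c "project -s -p $JOBDIR $PROJID" "$MNT"
    xfs_quota -x -c "limit -p bhard=$LIMIT $PROJID" "$MNT"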
I would start by checking whether Slurm recognises the InitScript option at all.
Regards magnus