Hello!
Our jobs can ask for dedicated per-node disk space, e.g. "--gres=tmp:1G", where an ephemeral directory is managed by the site prolog/epilog and usage is capped using an xfs project quota. This works well, although we really need to look at job_container/tmpfs.
I note that Slurm already periodically polls jobs for their memory usage, and that the result finds its way into the tres_usage fields of the accounting database.
What would be really nice is if we could extend this polling to feed our xfs project quota utilisation into the accounting database too. Users would get feedback on what their jobs actually need, and we would notice people who ask for a fraction of a node plus all of its disk space, but then do not use it.
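To make the idea concrete, here is a rough Python sketch of the node-side half only: reading per-project usage from xfs_quota. It says nothing about how the numbers would get into Slurm, which is the part I am asking about. The function names are made up for illustration, and it assumes the report columns come out as project, used, soft, hard (in 1 KiB blocks) from "report -p -b -N".

```python
import subprocess


def parse_project_report(text):
    """Parse project quota report text into {project: (used_kib, hard_kib)}.

    Assumes each data line looks like:
        <project> <used> <soft> <hard> <warn/grace...>
    Lines that do not match (headers, blanks) are skipped.
    """
    usage = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            used, hard = int(fields[1]), int(fields[3])
        except ValueError:
            continue  # header or unrecognised line
        usage[fields[0]] = (used, hard)
    return usage


def poll_project_usage(mountpoint):
    """Run xfs_quota against a mount point (needs root) and parse the result."""
    out = subprocess.run(
        ["xfs_quota", "-x", "-c", "report -p -b -N", mountpoint],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_project_report(out)
```

Something like poll_project_usage("/local") (the mount point is just an example) would then return a dictionary keyed by project ID or name, which is the figure we would like to see recorded per job.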
Slurm is famous for being extensible. Is there some sort of plugin hook someone can point me at to have a go with, or am I missing a simpler method, please?
Thanks!
Mark