[slurm-users] Get Information from a Node to the MailProg Command / Add arbitrary information to a job

Matthias Loose m.loose at mindcode.de
Wed Jun 16 06:38:42 UTC 2021

Hi Slurm Users,

first time posting. I have a new Slurm setup where users can specify 
an amount of local node disk space they wish to use. This is a "gres" 
resource named "local", measured in GB. Once the user has scheduled a 
job and it gets executed, I create a folder for this job on the node 
and add an XFS project quota for it in the node prolog, with the 
requested amount as the soft limit and +5% as the hard limit. The users 
then get this folder set as their $TMPDIR in the user prolog. Lastly, I 
remove the quota and the folder on job completion via the node epilog.
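
To make the limit arithmetic concrete, here is a minimal sketch of how the prolog could derive the soft and hard limits and build the xfs_quota call. It assumes the job ID doubles as the XFS project ID, that "/local" is the mount point, and that 1 GB is treated as 1024 MB; all three are illustrative choices, and the helper names are hypothetical. Setting up the project itself is not shown.

```python
import shlex


def quota_limits_mb(requested_gb, hard_margin_percent=5):
    """Return (soft, hard) block limits in MB for a request given in GB."""
    soft_mb = requested_gb * 1024
    # Integer arithmetic keeps the +5% hard limit free of float rounding.
    hard_mb = soft_mb * (100 + hard_margin_percent) // 100
    return soft_mb, hard_mb


def xfs_limit_command(project_id, requested_gb, mount_point):
    """Build the argv for an xfs_quota project limit (assumed layout)."""
    soft_mb, hard_mb = quota_limits_mb(requested_gb)
    limit = f"limit -p bsoft={soft_mb}m bhard={hard_mb}m {project_id}"
    return ["xfs_quota", "-x", "-c", limit, mount_point]


# Example: job 327 requested 10 GB of the "local" gres.
print(shlex.join(xfs_limit_command(327, 10, "/local")))
# prints: xfs_quota -x -c 'limit -p bsoft=10240m bhard=10752m 327' /local
```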

This all works great so far. Now I was busy writing an email script 
that would notify the users if their "local" space was used up. Since 
Slurm itself has no idea what the gres "local" actually is and only 
manages it as a number, I have to do it myself. My thought was that I 
would check the quota on job termination in the node epilog to see 
where the usage is at, but I've now run into a snag: how to get this 
information to the mailprog configured in slurm.conf.

The arguments to that program appear to be always in this form:
   -s SLURM Job_id=327 Name=ddt_clone Ended, Run time 00:05:01, 

and the environment of the script only contains the cluster name and 
nothing else.
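
As a rough illustration of pulling the job ID out of that argument line, the sketch below joins the arguments and searches for the "Job_id=" token. The function name is hypothetical, and array jobs or other subject formats are not handled.

```python
import re

# Matches the "Job_id=<number>" token seen in the mail subject.
JOB_ID_RE = re.compile(r"\bJob_id=(\d+)")


def job_id_from_args(argv):
    """Return the job ID found in the mailprog arguments, or None."""
    match = JOB_ID_RE.search(" ".join(argv))
    return int(match.group(1)) if match else None


# Example using the argument form shown above.
sample = ["-s", "SLURM Job_id=327 Name=ddt_clone Ended, Run time 00:05:01, "]
print(job_id_from_args(sample))  # prints: 327
```

In a real mailprog wrapper the list passed in would be sys.argv[1:].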

The question now becomes: how do I get information about the quota 
status at the end of the job from the node epilog to the mailprog 
running on the head node? I can parse the job ID from the argument line 
passed to the script and can thus get all job information via scontrol. 
So my first thought was that if I could add my own data field to that 
output, it would solve my problem. Unfortunately, I can't seem to find 
such an option.
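
For reference, the scontrol lookup could look roughly like the sketch below. It assumes the one-line output mode ("scontrol -o show job <id>") and splits it into key=value pairs; values that contain spaces would be cut short by this naive split. The function names and the sample string are made up for illustration.

```python
import subprocess


def parse_scontrol_oneliner(text):
    """Split one-line scontrol output into a dict of key=value fields."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:
            # Keep the first occurrence if a key is repeated.
            fields.setdefault(key, value)
    return fields


def show_job(job_id):
    """Query scontrol for one job (requires a working Slurm install)."""
    result = subprocess.run(
        ["scontrol", "-o", "show", "job", str(job_id)],
        check=True, capture_output=True, text=True,
    )
    return parse_scontrol_oneliner(result.stdout)


# Made-up sample line, not real cluster output.
sample = "JobId=327 JobName=ddt_clone JobState=COMPLETED"
print(parse_scontrol_oneliner(sample)["JobState"])  # prints: COMPLETED
```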

Other than that, I've only come up with writing some sort of file to a 
shared storage mount that could be read by the mailprog.
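
The shared-file idea could be sketched as follows: the node epilog writes a small JSON report named after the job ID, and the mailprog on the head node reads it back. The directory, field names, and function names are all assumptions for illustration. The reader returns None when no report exists, since the mail may be sent before the epilog has written the file.

```python
import json
import os
import tempfile
from pathlib import Path


def write_quota_report(shared_dir, job_id, used_mb, soft_mb):
    """Epilog side: store the quota status for one job on shared storage."""
    shared_dir = Path(shared_dir)
    shared_dir.mkdir(parents=True, exist_ok=True)
    report = {
        "job_id": job_id,
        "used_mb": used_mb,
        "soft_mb": soft_mb,
        "exceeded": used_mb >= soft_mb,
    }
    # Write to a temp file first, then rename, so readers never see a
    # half-written report.
    fd, tmp_name = tempfile.mkstemp(dir=shared_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(report, handle)
    os.replace(tmp_name, shared_dir / f"{job_id}.json")


def read_quota_report(shared_dir, job_id):
    """Mailprog side: load the report for a job, or None if it is missing."""
    path = Path(shared_dir) / f"{job_id}.json"
    try:
        return json.loads(path.read_text())
    except FileNotFoundError:
        return None
```

The mailprog would call read_quota_report with the job ID parsed from its arguments and add a line to the mail body when "exceeded" is true.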

Can you think of a more elegant solution to add this information to the 
job so that it can be accessed on the head node by the mailprog using 
the job ID?

Any help is appreciated!
