[slurm-users] how can users start their worker daemons using srun?

Chris Samuel chris at csamuel.org
Mon Aug 27 18:21:45 MDT 2018


On Tuesday, 28 August 2018 8:15:55 AM AEST Priedhorsky, Reid wrote:

> I am trying to figure out how to advise users on starting worker daemons in
> their allocations using srun. That is, I want to be able to run “srun foo”,
> where foo starts some child process and then exits, and the child
> process(es) persist and wait for work.

That won't happen on a well-configured Slurm system, because it is Slurm's role to 
clean up any processes a job leaves behind once that job exits. This 
is why cgroups and pam_slurm_adopt are so useful: they let you track and kill 
those processes far more easily.
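
For reference, the kind of setup I mean looks roughly like the fragment below. This is a sketch from memory rather than a complete configuration; the file locations are the usual ones but are an assumption for your site, and you should check each option against the documentation for your Slurm version.

```
# slurm.conf (fragment)
ProctrackType=proctrack/cgroup   # track job processes with cgroups
TaskPlugin=task/cgroup           # confine tasks to their allocated resources
PrologFlags=contain              # needed so pam_slurm_adopt has a job container to adopt into

# /etc/pam.d/sshd (fragment)
account    required    pam_slurm_adopt.so
```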

If you want processes to stick around, you either need to ask for enough time 
in the job and ensure that the script doesn't exit (and thus signal the end of 
the job) until those daemons are done, or you need to find a way outside 
of Slurm to do it.
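
A minimal sketch of the first option, assuming the worker can be told to stay in the foreground. The command ./my_worker and its --foreground flag are hypothetical, and the node count and time limit are placeholders:

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --time=04:00:00   # assumption: long enough to cover the daemons' whole lifetime

# Hypothetical worker command. It must stay in the foreground: if it forks
# and exits, the job step ends and Slurm cleans up whatever it left behind.
WORKER_CMD=${WORKER_CMD:-./my_worker --foreground}

run_job() {
    # Start one worker per node as a background job step.
    # $WORKER_CMD is left unquoted on purpose so it splits into command + args.
    srun --ntasks-per-node=1 $WORKER_CMD &
    step_pid=$!

    # ... hand work to the workers here ...

    # Block until the worker step finishes, so the batch script (and
    # therefore the job) does not end while the daemons are still needed.
    wait "$step_pid"
}

# Only run when Slurm has actually started this script as a job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    run_job
fi
```

The point is simply that the batch script blocks in "wait" for as long as the workers are alive, so Slurm never sees the job end early.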

One possible way for the latter would be to configure something like systemd 
to allow specific users to run daemons as themselves.   Then you could let 
them submit a job where they do:

systemctl start --user mydaemon.service

to start it up (and check it has started successfully before exiting).
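
Something along these lines would do the start-and-check step. It is only a sketch: the unit name is the hypothetical one from the command above, and it assumes the user's systemd instance is reachable from inside the job (typically that means lingering has been enabled for the user with "loginctl enable-linger" and XDG_RUNTIME_DIR is set in the job environment).

```shell
#!/bin/bash
# Hypothetical unit name, matching the command above.
UNIT=${UNIT:-mydaemon.service}

start_and_check() {
    # Ask the user's systemd instance to start the daemon.
    if ! systemctl --user start "$UNIT"; then
        echo "failed to start $UNIT" >&2
        return 1
    fi
    # Confirm the unit is really active before the job script exits.
    if ! systemctl --user is-active --quiet "$UNIT"; then
        echo "$UNIT is not active after start" >&2
        return 1
    fi
}

# Only run when Slurm has actually started this script as a job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    start_and_check || exit 1
fi
```

If the check fails the job exits non-zero, so the user sees the failure in the job state rather than finding out later that no daemon is running.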

There's a bit about how to do this here (which I've just started using for a 
side radio-astronomy project at the observatory I volunteer at):

https://www.brendanlong.com/systemd-user-services-are-amazing.html

Hope this helps!

All the best,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC





