[slurm-users] how can users start their worker daemons using srun?
Chris Samuel
chris at csamuel.org
Mon Aug 27 18:21:45 MDT 2018
On Tuesday, 28 August 2018 8:15:55 AM AEST Priedhorsky, Reid wrote:
> I am trying to figure out how to advise users on starting worker daemons in
> their allocations using srun. That is, I want to be able to run “srun foo”,
> where foo starts some child process and then exits, and the child
> process(es) persist and wait for work.
That won't happen on a well-configured Slurm system, as it is Slurm's role to
clear up any processes from that job left around once that job exits. This
is why cgroups and pam_slurm_adopt are so useful: they let you track and kill
those processes off far more easily.
If you want processes to stick around, you either need to ask for enough time
in the job and ensure that the script doesn't exit (and thus signal the end of
the job) until those daemons are done, or you will need to find a way outside
of Slurm to do it.
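For the first option, a minimal sketch of a batch script that stays alive until the workers are finished might look like the following. The names my-worker, --foreground and ./run-client are placeholders rather than real programs, and the time limit is only an example. It assumes the worker runs in the foreground of its srun step instead of daemonizing, and that signalling srun causes the step's tasks to stop.

```shell
#!/bin/bash
#SBATCH --time=02:00:00   # example only: must cover the workers' whole lifetime

# Hypothetical commands; override them in the environment to suit your site.
WORKER_CMD=${WORKER_CMD:-"srun my-worker --foreground"}
CLIENT_CMD=${CLIENT_CMD:-"./run-client"}

run_with_workers() {
    # Start the workers in the background. They are deliberately left
    # unquoted so the command string is split into words.
    $WORKER_CMD &
    local worker_pid=$!

    # Do the real work while the workers are up, remembering its exit status.
    local status=0
    $CLIENT_CMD || status=$?

    # Stop the workers, then block until they are gone, so the batch
    # script (and thus the job) does not end before they do.
    kill "$worker_pid" 2>/dev/null || true
    wait "$worker_pid" 2>/dev/null || true
    return "$status"
}

# Only run automatically when we are actually inside a Slurm job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    run_with_workers
fi
```

The point is simply that the batch script only returns after the background step has been waited on, so Slurm has nothing left over to clean up.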
One possible way for the latter would be to configure something like systemd
to allow specific users to run daemons as themselves. Then you could let
them submit a job where they do:
    systemctl start --user mydaemon.service
to start it up (and check it has started successfully before exiting).
There's a bit about how to do this here (which I've just started using for a
side radio-astronomy project at the observatory I volunteer at):
https://www.brendanlong.com/systemd-user-services-are-amazing.html
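As a rough sketch of the "check it has started successfully before exiting" part, the job script could do something like the following. The function name start_and_verify and the retry count are my own inventions for illustration, and mydaemon.service is the placeholder unit from above. It assumes the site has set things up so that the user's systemd instance is reachable from within the job and user services are allowed to outlive it.

```shell
#!/bin/bash
# Hypothetical unit name, taken from the example above.
UNIT=${UNIT:-mydaemon.service}

start_and_verify() {
    local unit=$1
    local tries=${2:-10}   # assumed default: poll for up to ~10 seconds

    # Ask the user's systemd instance to start the unit.
    systemctl --user start "$unit" || return 1

    # Poll, since depending on the service type "start" may return
    # before the daemon is actually up.
    local i
    for i in $(seq "$tries"); do
        if systemctl --user is-active --quiet "$unit"; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# Only run automatically when we are actually inside a Slurm job.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    start_and_verify "$UNIT" || { echo "failed to start $UNIT" >&2; exit 1; }
fi
```

Exiting non-zero when the unit never becomes active means the job shows up as failed, rather than quietly leaving the user without a daemon.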
Hope this helps!
All the best,
Chris
--
Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC