[slurm-users] Job completed but child process still running

Juergen Salk juergen.salk at uni-ulm.de
Mon Jan 13 16:44:27 UTC 2020


* Chris Samuel <chris at csamuel.org> [200113 07:30]:
> On 1/13/20 5:55 am, Youssef Eldakar wrote:
> 
> > In an sbatch script, a user calls a shell script that starts a Java
> > background process. The job completes immediately, but the child Java
> > process is still running on the compute node.
> > 
> > Is there a way to prevent this from happening?
> 
> What I would recommend is to use Slurm's cgroups support so that processes
> that put themselves into the background this way are tracked as part of the
> job and cleaned up when the job exits.
> 
> https://slurm.schedmd.com/cgroups.html

Hi,

I don't intend to hijack this thread, but may I add a
question here, just to be 100% sure?

Are you saying that there is absolutely no need to take care 
of potential leftover/stray processes in the epilog script any
more with proctrack/cgroup enabled?

I do have ProctrackType=proctrack/cgroup in slurm.conf, but I still
have a cleanup routine in the epilog script that kills any leftover
processes owned by the user and removes leftover semaphores, shared
memory segments and message queues by means of ipcrm. Is that totally
pointless when using the proctrack/cgroup plugin for process tracking
in Slurm?

Best regards
Jürgen

-- 
Jürgen Salk
Scientific Software & Compute Services (SSCS)
Kommunikations- und Informationszentrum (kiz)
Universität Ulm
Telefon: +49 (0)731 50-22478
Telefax: +49 (0)731 50-22471


