[slurm-users] Why every job will sleep 100000000
Jeffrey T Frey
frey at udel.edu
Fri Nov 4 13:21:04 UTC 2022
If you examine the process hierarchy, that "sleep 100000000" process is probably the child of a "slurmstepd: [<jobid>.extern]" process. This is a housekeeping step launched for the job by slurmd -- in older Slurm releases it would handle X11 forwarding, for example. It should have no impact on the other steps of the job.
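
One way to check the parent of that sleep process on the compute node is sketched below. This is only an illustration, not a Slurm tool: the helper name sleep_parents is made up here, and it assumes a Linux procps "ps" that accepts "-eo pid=,ppid=,args=". It reads "PID PPID COMMAND" rows and prints each "sleep 100000000" process next to its parent's command line.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): pair every "sleep 100000000"
# process with the command line of its parent.
# Input on stdin: rows of "PID PPID ARGS...", e.g. from
#   ps -eo pid=,ppid=,args=
sleep_parents() {
    awk '
        {
            pid = $1
            # Rebuild the command line from the remaining fields.
            cmd = $3
            for (i = 4; i <= NF; i++) cmd = cmd " " $i
            cmds[pid] = cmd
            parent[pid] = $2
        }
        END {
            for (pid in cmds) {
                if (cmds[pid] != "sleep 100000000") continue
                p = parent[pid]
                # Fall back to a placeholder if the parent row is missing.
                pcmd = (p in cmds) ? cmds[p] : "(unknown)"
                printf "%s <- %s\n", cmds[pid], pcmd
            }
        }
    '
}
```

On a node where the job is running, it could be used as "ps -eo pid=,ppid=,args= | sleep_parents". If the right-hand side shows a "slurmstepd: [<jobid>.extern]" process, the sleep belongs to the extern step described above, and the cause of a hung job should be sought elsewhere.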
> On Nov 4, 2022, at 05:26, GHui <ugiwgh at qq.com> wrote:
>
> I found a sleep process run by root when I submit a job, and it sleeps for 100000000 seconds.
> Sometimes my job hangs. The job state is "R" even though it runs nothing. The job script looks like the following:
> ----------
> #!/bin/bash
> #SBATCH -J sub
> #SBATCH -N 1
> #SBATCH -n 1
> #SBATCH -p vpartition
>
> ----------
>
> Is it because of the "sleep 100000000" process? If not, how could I debug it?
>
> Any help will be appreciated.
> --GHui