[slurm-users] unable to kill namd3 process

Shaghuf Rahman shaghuf at gmail.com
Tue Apr 25 15:02:34 UTC 2023


Hi,

Also, I forgot to mention that the process is still running after the user
does scancel, and the epilog does not clean it up when one job finishes while
multiple jobs have been submitted.
We tried the unkillable option, but it did not work. The process
remains until it is killed manually.



On Tue, 25 Apr 2023 at 19:57, Shaghuf Rahman <shaghuf at gmail.com> wrote:

> Hi,
>
> We are facing an issue in our environment, and the behaviour looks strange
> to me. It is specifically associated with the namd3 application.
> The issue is described below, along with the cases I have tested.
>
> I am trying to understand how to kill the processes of a namd3
> job submitted through sbatch without putting the node into the drain state.
>
> What I observed is that when a user submits a single job on a node and then
> runs scancel on the namd3 job, the job is killed, the node returns to the
> idle state, and everything looks as expected.
> But when the user submits multiple jobs on a single node and runs scancel on
> one of them, the node is put into the drain state. However, the other jobs
> keep running without any issue.
>
> Because of this, multiple nodes go into the drain state whenever a user
> runs scancel on a namd3 job.
>
> Note: When the user does not run scancel, all jobs run successfully
> and the node states are also fine.
>
> This does not happen with any of the other applications, so we
> suspect the issue is with the namd3 application.
> Kindly suggest a solution or any ideas on how to fix this issue.
>
> Thanks in advance,
> Shaghuf Rahman
>
>