[slurm-users] unable to kill namd3 process

Shaghuf Rahman shaghuf at gmail.com
Tue Apr 25 14:27:48 UTC 2023


We are facing an issue in our environment, and the behaviour looks strange
to me. It is specifically associated with the namd3 application.
The issue is described below, along with the cases I have tested.

I am trying to understand how to kill the processes of a namd3 job
submitted through sbatch without putting the node into the drain state.

What I observed is that when a user submits a single job on a node and then
runs scancel on that namd3 job, the job is killed, the node returns to the
idle state, and everything looks as expected.
But when the user submits multiple jobs on a single node and runs scancel on
one of them, the node is put into the drain state. The other jobs, however,
keep running without any issue.

Because of this, multiple nodes end up in the drain state whenever a user
runs scancel on a namd3 job.

Note: When the user does not run scancel, all jobs run successfully and
the node states are also fine.

This does not happen with any of the other applications, so we suspect
the issue could be with the namd3 application.
Kindly suggest a solution or any ideas on how to fix this issue.

Thanks in advance,
Shaghuf Rahman
