[slurm-users] Users can't scancel

William Markuske wmarkuske at sdsc.edu
Wed Nov 18 17:00:58 UTC 2020


Hello,

I am having an odd problem where users are unable to kill their jobs 
with scancel. Users can submit jobs just fine, and when a task 
completes on its own the job closes out correctly. However, if a user 
attempts to cancel a job via scancel, SIGKILL is sent to the step but 
the processes do not terminate. Slurmd then continues to send SIGKILL 
until UnkillableStepTimeout is hit, at which point the Slurm job exits 
with an error, the node enters a draining state, and the spawned 
processes continue to run on the node.

I'm at a loss because jobs can complete without issue, which seems to 
suggest it's not a networking or permissions issue preventing Slurm 
from doing its job accounting tasks. A user can ssh to the node once a 
job is submitted and kill the subprocesses manually, at which point 
Slurm completes the epilog and the node returns to idle.

Does anyone know what may be causing such behavior? Please let me know 
any slurm.conf or cgroup.conf settings that would be helpful to diagnose 
this issue. I'm quite stumped by this one.
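
As a starting point for gathering those settings, here is a minimal 
Python sketch, assuming scontrol is on the PATH of the node where it 
runs. It parses the output of `scontrol show config` and prints a 
handful of parameters that plausibly bear on signal delivery and 
process tracking. The selection of keys and the helper names 
(parse_scontrol_config, relevant_settings) are illustrative 
assumptions, not an official checklist.

```python
import shutil
import subprocess

# Parameters that plausibly matter when SIGKILL does not end a step.
# This selection is an assumption for illustration only.
RELEVANT_KEYS = (
    "ProctrackType",
    "TaskPlugin",
    "KillWait",
    "UnkillableStepTimeout",
    "UnkillableStepProgram",
)


def parse_scontrol_config(text):
    """Turn 'Key = Value' lines from `scontrol show config` into a dict."""
    settings = {}
    for line in text.splitlines():
        # Split on the first '=' only, since values may contain '=' themselves.
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def relevant_settings(text, keys=RELEVANT_KEYS):
    """Return only the keys of interest, marking absent ones explicitly."""
    parsed = parse_scontrol_config(text)
    return {key: parsed.get(key, "(not set)") for key in keys}


if __name__ == "__main__":
    if shutil.which("scontrol") is None:
        print("scontrol not found on PATH; run this on a Slurm node")
    else:
        output = subprocess.run(
            ["scontrol", "show", "config"],
            capture_output=True,
            text=True,
            check=True,
        ).stdout
        for key, value in relevant_settings(output).items():
            print(f"{key} = {value}")
```

The cgroup.conf contents are not covered by this sketch and would 
need to be included separately.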

-- 

Willy Markuske

HPC Systems Engineer

	

Research Data Services

P: (858) 246-5593


