[slurm-users] help with canceling or deleting a job

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Wed Sep 20 07:11:38 UTC 2023


On 9/20/23 01:39, Feng Zhang wrote:
> Restarting the slurmd daemon of the compute node should work, if the
> node is still online and normal.

Probably not.  If the filesystem used by the job is hung, the node will
most likely have to be rebooted, and the filesystem must be checked.
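
A rough way to check for this on the node, as a sketch only: the CG state
in the squeue output means the job is COMPLETING, and it typically stays
there when its processes cannot be killed because they are blocked on I/O.
Such processes show up with a "D" (uninterruptible sleep) in the STAT
column of ps.  The helper name filter_d_state below is made up for
illustration; the ps call assumes a Linux node with procps.

```shell
#!/bin/sh
# Hypothetical helper: keep the header line plus every process whose STAT
# column starts with "D" (uninterruptible sleep, usually blocked on I/O).
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Run this on the compute node itself (awn-047 in this thread).
# wchan shows the kernel function the process is waiting in, which
# often hints at the filesystem involved.
ps -eo pid,stat,wchan:32,comm | filter_d_state
```

If job processes are listed there, neither scancel nor a slurmd restart
will get rid of them, since a process in that state does not react to
SIGKILL.  After a reboot, a node that Slurm has marked DOWN can usually be
returned to service with "scontrol update NodeName=awn-047 State=RESUME".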

/Ole

> On Tue, Sep 19, 2023 at 8:03 AM Felix <felix at itim-cj.ro> wrote:
>>
>> Hello
>>
>> I have a job on my system which has been running past its time limit,
>> for more than 4 days.
>>
>> 1808851     debug  gridjob  atlas01 CG 4-00:00:19      1 awn-047
>>
>> I'm trying to cancel it
>>
>> [@arc7-node ~]# scancel 1808851
>>
>> I get no message, as if the job had been canceled, but when I query
>> the job, it is still there
>>
>> [@arc7-node ~]# squeue | grep awn-047
>>              1808851     debug  gridjob  atlas01 CG 4-00:00:19 1 awn-047
>>
>> Is there anything else I can do to kill the job?


