[slurm-users] help with canceling or deleting a job
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Wed Sep 20 07:11:38 UTC 2023
On 9/20/23 01:39, Feng Zhang wrote:
> Restarting the slurmd daemon on the compute node should work, if the
> node is still online and healthy.
Probably not. If the filesystem used by the job is hung, the node will
probably have to be rebooted, and the filesystem must be checked.
/Ole
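
A rough sketch of how one might check for this before rebooting, using
the job ID and node name from the original post. The helper names
filter_dstate, list_dstate and reset_node are hypothetical, and the
assumption that the node needs a DOWN/RESUME cycle after the reboot is
mine, not something stated in this thread. Processes in state D
(uninterruptible sleep) are a common sign of a task blocked on a hung
filesystem; such processes cannot be killed by signals.

```shell
#!/bin/sh
# Sketch only: values taken from the original post.
JOBID=1808851
NODE=awn-047

# Keep only lines whose second column (process state) starts with "D",
# i.e. uninterruptible sleep. Reads "PID STAT COMMAND" lines on stdin.
filter_dstate() {
    awk '$2 ~ /^D/'
}

# Run on the compute node: list processes currently in state D.
list_dstate() {
    ps -eo pid=,stat=,comm= | filter_dstate
}

# Run on the controller once the node has been rebooted and the
# filesystem checked (assumption: the job is still shown as CG).
# Marking the node DOWN and then resuming it asks slurmctld to
# give up on the stuck job and return the node to service.
reset_node() {
    scontrol update NodeName="$NODE" State=DOWN Reason="hung job $JOBID"
    scontrol update NodeName="$NODE" State=RESUME
}
```

If list_dstate on awn-047 shows the job's processes in state D, then
scancel and a slurmd restart are unlikely to help, which matches the
advice above to reboot the node.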
> On Tue, Sep 19, 2023 at 8:03 AM Felix <felix at itim-cj.ro> wrote:
>>
>> Hello
>>
>> I have a job on my system that has been running past its time limit,
>> for more than 4 days.
>>
>> 1808851 debug gridjob atlas01 CG 4-00:00:19 1 awn-047
>>
>> I'm trying to cancel it
>>
>> [@arc7-node ~]# scancel 1808851
>>
>> I get no message, as if the job had been canceled, but when I query
>> the job, it is still there
>>
>> [@arc7-node ~]# squeue | grep awn-047
>> 1808851 debug gridjob atlas01 CG 4-00:00:19 1 awn-047
>>
>> Is there anything else I can do to kill the job?