[slurm-users] Job not cancelled after "TimeLimit" exceeded
Gestió Servidors
sysadmin.caos at uab.cat
Tue Mar 10 13:13:52 UTC 2020
Hello,
I have checked my configuration with "scontrol show config", and these are the values of those three parameters:
AccountingStorageEnforce = none
EnforcePartLimits = NO
OverTimeLimit = 500 min
...so now I understand why my job hasn't been cancelled after 8 hours: OverTimeLimit allows it 500 more minutes beyond the time limit.
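To make the arithmetic explicit, here is a rough sketch using the numbers from this thread. It assumes the job's own time limit equals the partition's 8-hour TimeLimit and that OverTimeLimit is simply added on top of it. The variable names are only illustrative, and Slurm's actual cancellation can lag slightly behind the computed moment.

```shell
#!/usr/bin/env bash
# Values taken from this thread (assumption: the job inherited the partition limit).
time_limit_min=$((8 * 60))        # partition TimeLimit: 8 hours = 480 min
over_time_limit_min=500           # OverTimeLimit reported by "scontrol show config"
elapsed_min=$((9 * 60 + 30))      # the job had been running for 9h30m = 570 min

# Earliest point at which the job becomes eligible for cancellation.
kill_after_min=$((time_limit_min + over_time_limit_min))
remaining_min=$((kill_after_min - elapsed_min))

printf 'Earliest cancellation: %dh%02dm after start\n' \
    $((kill_after_min / 60)) $((kill_after_min % 60))
printf 'Grace remaining at 9h30m: %d min\n' "$remaining_min"

# On a real cluster the three settings can be listed with (requires Slurm):
# scontrol show config | grep -E 'AccountingStorageEnforce|EnforcePartLimits|OverTimeLimit'
```

With these values the job would only become eligible for cancellation 16h20m after it started, so at 9h30m it still had 410 minutes of grace left.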
Thanks.
> Date: Tue, 10 Mar 2020 11:25:08 +0100
> From: Ole Holm Nielsen <Ole.H.Nielsen at fysik.dtu.dk>
> To: <slurm-users at lists.schedmd.com>
> Subject: Re: [slurm-users] Job not cancelled after "TimeLimit" exceeded
>
> On 3/10/20 9:03 AM, sysadmin.caos wrote:
> > My SLURM cluster has a partition configured with a "TimeLimit" of 8 hours.
> > A job has now been running for 9h30m and it has not been cancelled.
> > During these nine and a half hours, a script executed "scontrol
> > update partition=mypartition state=down" to disable this partition
> > (it is an educational cluster, and student classes start at 8:00).
> >
> > Why hasn't my job been cancelled? There is no log entry on the SLURM
> > controller that explains this behaviour.
>
> You may want to check the following parameter in your slurm.conf file (read
> the man-page first):
>
> AccountingStorageEnforce: This controls what level of association-based
> enforcement to impose on job submissions.
>
> You may want to read about EnforcePartLimits and OverTimeLimit
> parameters as well.
>
> Display your current configuration by: scontrol show config
>
> /Ole
>
>