[slurm-users] Slurm preemption grace time

Jon Tegner tegner at renget.se
Mon Nov 20 10:01:17 MST 2017


Hi,

Could you try submitting the following script:

Script job.sh:
******************************
#!/bin/bash
#SBATCH -p test-low
#SBATCH -n 3
#SBATCH -t 12:00:00
sig_term()
{
    echo "function sig_term called.  Exiting"
    echo 'sig_term' > slask_term
    date >> slask_term
}
# associate the function "sig_term" with the TERM signal
trap 'sig_term' SIGTERM

# run the workload in the background and wait on it, so that bash can
# run the trap as soon as the signal arrives
sleep 1000 &
wait $!
******************************

and see if you catch the first SIGTERM. When I tried this, the signal 
was caught ONLY at the end of the grace time.
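
A minimal local sketch of the same trap-and-wait pattern, runnable without
Slurm, can help separate shell behaviour from scheduler behaviour. It sends
one SIGTERM to the script itself and logs a timestamp for each signal
received. The names on_term and log are illustrative and not part of the job
script above, and the self-sent signal only stands in for whatever the
scheduler would send.

```shell
#!/bin/bash
# Local sketch (assumption: plain bash, no Slurm involved).
log=$(mktemp)               # hypothetical log file, stands in for slask_term

on_term() {
    # Append one timestamped line per SIGTERM delivered to this shell.
    echo "SIGTERM at $(date +%s)" >> "$log"
}
trap 'on_term' TERM

sleep 30 &                  # background workload, as in the job script
child=$!

# Stand-in for the scheduler: send SIGTERM to this shell after one second.
( sleep 1; kill -TERM $$ ) &

status=0
wait "$child" || status=$?  # wait is interrupted by the trapped signal

# Clean up the workload and count the signals that were logged.
kill "$child" 2>/dev/null
wait "$child" 2>/dev/null || true
term_count=$(grep -c 'SIGTERM' "$log")
echo "wait status=$status, SIGTERMs logged=$term_count"
```

If the handler fires here after about one second but only at the end of the
grace time inside a job, the difference is more likely in how and when the
signal is delivered to the batch script than in the trap itself.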

(I'll try your settings as soon as my system is up again)

Regards,

/jon

On 11/20/2017 04:21 PM, Ailing Zhang wrote:
>
> Hi slurm community,
>
> I'm testing partition-based preemption. Partitions test-high and 
> test-low share the same nodes. I set GraceTime=600 and 
> PreemptMode=CANCEL in test-low, but as soon as I submit a job to 
> test-high, the job in test-low is killed immediately, without any 
> grace time.
> Here are my configs.
> PartitionName=test-low
>    AllowGroups=admins AllowAccounts=ALL AllowQos=ALL
>    AllocNodes=ALL Default=NO QoS=N/A
>    DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=600 Hidden=NO
>    MaxNodes=UNLIMITED MaxTime=02:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
>    Nodes=node[100-102]
>    PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
>    OverTimeLimit=NONE PreemptMode=CANCEL
>    State=UP TotalCPUs=100 TotalNodes=3 SelectTypeParameters=NONE
>    DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
>
> PartitionName=test-high
>    AllowGroups=admins AllowAccounts=ALL AllowQos=ALL
>    AllocNodes=ALL Default=NO QoS=N/A
>    DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
>    MaxNodes=UNLIMITED MaxTime=02:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
>    Nodes=node[100-102]
>    PriorityJobFactor=30 PriorityTier=30 RootOnly=NO ReqResv=NO OverSubscribe=NO
>    OverTimeLimit=NONE PreemptMode=OFF
>    State=UP TotalCPUs=100 TotalNodes=3 SelectTypeParameters=NONE
>    DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED
>
> Any help will be much appreciated.
>
> Thanks!
> Ailing
