[slurm-users] Suspend QOS help
Walls, Mitchell
miwalls at siue.edu
Fri Feb 18 15:54:17 UTC 2022
Both jobs would be using the whole node, same as below, but with two nodes. I've reduced the problem space to two isolated partitions on just node04.
NodeName=node04 CPUs=32 Boards=1 SocketsPerBoard=2 CoresPerSocket=16 ThreadsPerCore=1 RealMemory=257476 Features=cpu
# QOSes have stayed the same.
      Name   Priority    Preempt PreemptMode
---------- ---------- ---------- -----------
   general       1000    suspend     cluster
   suspend        100                cluster
# test partitions
PartitionName=test Default=NO Nodes=cc-cpu-04 OverSubscribe=FORCE:1 MaxTime=30-00:00:00 Qos=general AllowQos=general
PartitionName=suspend Default=NO Nodes=cc-cpu-04 OverSubscribe=FORCE:1 MaxTime=30-00:00:00 Qos=suspend AllowQos=suspend
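As a sanity check, it can help to dump the preemption settings the running cluster is actually using, rather than what the config files say (standard scontrol/sacctmgr invocations; output will vary by site):

scontrol show config | grep -i preempt
sacctmgr show qos format=name,priority,preempt,preemptmode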
# stress-suspend.sh
#!/bin/bash
#SBATCH -p suspend
#SBATCH -C cpu
#SBATCH -q suspend
#SBATCH -c 32
#SBATCH --ntasks-per-node=1
#SBATCH -N 1
stress -c 32 -t $1
# stress.sh
#!/bin/bash
#SBATCH -p test
#SBATCH -C cpu
#SBATCH -q general
#SBATCH -c 32
#SBATCH --ntasks-per-node=1
#SBATCH -N 1
stress -c 32 -t $1
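For reference, this is roughly how I reproduce it (the 600-second duration is arbitrary; the expectation is that submitting the general-QOS job second should push the suspend-QOS job into state S):

sbatch stress-suspend.sh 600
sbatch stress.sh 600
squeue -l
scontrol show job <jobid> | grep -i JobState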
________________________________________
From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Brian Andrus <toomuchit at gmail.com>
Sent: Friday, February 18, 2022 9:36 AM
To: slurm-users at lists.schedmd.com
Subject: Re: [slurm-users] Suspend QOS help
At first look, I would guess that there are enough resources to satisfy
the requests of both jobs, so there is no need to suspend.
Having the node info and the job info to compare would be the next step.
Brian Andrus
On 2/18/2022 7:20 AM, Walls, Mitchell wrote:
> Hello,
>
> Hoping someone can shed some light on what is causing jobs to run on the same nodes simultaneously, rather than the lower-priority job actually being suspended? I can provide more info if someone can think of something that would help!
>
> # Relevant config.
> PreemptType=preempt/qos
> PreemptMode=SUSPEND,GANG
>
> PartitionName=general Default=YES Nodes=general OverSubscribe=FORCE:1 MaxTime=30-00:00:00 Qos=general AllowQos=general
> PartitionName=suspend Default=NO Nodes=general OverSubscribe=FORCE:1 MaxTime=30-00:00:00 Qos=suspend AllowQos=suspend
>
> # Qoses
>       Name   Priority    Preempt PreemptMode
> ---------- ---------- ---------- -----------
>    general       1000    suspend     cluster
>    suspend        100                cluster
>
> # squeue (another note: in htop I can see both processes actually running at the same time, not being timesliced)
> $ squeue
>  JOBID PARTITION     NAME    USER ST   TIME NODES NODELIST(REASON)
>  45085   general stress.s   user2  R   7:33     2 node[04-05]
>  45084   suspend stress-s   user1  R   7:40     2 node[04-05]
>
> Thanks!