[slurm-users] Limiting the number of CPU

Sukman sukman at pusat.itb.ac.id
Fri Nov 8 02:37:10 UTC 2019

Hi all,

I am having a problem limiting the number of CPUs used to run a job.
I tried to limit the CPUs to only 2 out of the maximum of 56.
But when I run a job that uses only 1 CPU, it is held because the QOS limit has already been reached.
When I set the limit to 56 CPUs, the job runs fine.

Does anyone have any suggestions regarding this problem?

The details of the problem are as follows.

My node has 56 cores (2 sockets x 28 cores).

I have already configured slurm.conf to enable QOS/limit enforcement.
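
The slurm.conf line itself is not quoted in this message. As a rough sketch, assuming enforcement is set through the AccountingStorageEnforce parameter, the check below tests whether a given value lists both flags. The enforces helper and the example value "limits,qos" are illustrative assumptions, not copied from the cluster.

```shell
# Return success if the comma-separated value in $1 contains the flag in $2.
enforces() {
  case ",$1," in
    *,"$2",*) return 0 ;;
    *) return 1 ;;
  esac
}

# Assumed example value; the real slurm.conf line is not shown in this message.
enforce_value="limits,qos"

for flag in limits qos; do
  if enforces "$enforce_value" "$flag"; then
    echo "$flag: enforced"
  else
    echo "$flag: not enforced"
  fi
done

# On a live system the active value can be read with:
#   scontrol show config | grep -i AccountingStorageEnforce
```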


For the QOS itself, I just applied a simple limit, setting the number of CPUs to 2.

sacctmgr show qos where Name=normal_compute format=Name,Priority,UsageFactor,MaxWall,MaxTRESPU
      Name   Priority UsageFactor     MaxWall     MaxTRESPU 
---------- ---------- ----------- ----------- ------------- 
normal_co+         10    1.000000    00:01:00  cpu=2,mem=1G

I then applied the QOS to a specific user, sukman.

#QOS-defined user
sacctmgr list association where User=sukman format=User,QOS,
      User                  QOS 
---------- -------------------- 
    sukman       normal_compute
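
The commands used to create these two settings are not shown above. A sketch of how they could be applied with sacctmgr, assuming the standard modify syntax and that the MaxTRESPU column corresponds to the MaxTRESPerUser option. The run helper is hypothetical: it only prints each command unless RUN=1 is set, so nothing in the accounting database changes by accident.

```shell
# Dry run by default: print the command instead of executing it.
# Set RUN=1 to really run it (needs Slurm admin rights).
run() {
  if [ "${RUN:-0}" = 1 ]; then
    "$@"
  else
    echo "$*"
  fi
}

# Cap each user of the QOS at 2 CPUs and 1G of memory in total.
run sacctmgr -i modify qos normal_compute set MaxTRESPerUser=cpu=2,mem=1G

# Assign the QOS to user sukman.
run sacctmgr -i modify user where name=sukman set qos=normal_compute
```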

Then I tried to run a simple command, hostname, using just 1 node, 1 task, and 1 CPU.

#SBATCH --job-name=hostname
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --nodelist=cn110

srun hostname
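
To make the expectation explicit, here is a small sketch of the arithmetic: the CPUs requested by the script (tasks times CPUs per task) compared with the cpu entry of the MaxTRESPU string. The variable names are only illustrative, and this compares the request as written in the script, which is not necessarily what the scheduler counts.

```shell
# Limit string as shown in the MaxTRESPU column.
max_tres_pu="cpu=2,mem=1G"

# Values from the #SBATCH lines.
ntasks=1
cpus_per_task=1

# Pull the cpu=N entry out of the comma-separated TRES list.
cpu_limit=$(printf '%s\n' "$max_tres_pu" | tr ',' '\n' | sed -n 's/^cpu=//p')

# CPUs requested by the script.
requested=$((ntasks * cpus_per_task))

if [ "$requested" -le "$cpu_limit" ]; then
  echo "requested $requested CPU(s), limit $cpu_limit: within limit"
else
  echo "requested $requested CPU(s), limit $cpu_limit: over limit"
fi
```

With these values the request is 1 CPU against a limit of 2, so I would expect the job to be allowed.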

However, the job stays pending because the QOS limit has already been reached.

                68      defq hostname   sukman PD       0:00      1 (QOSMaxCpuPerUserLimit)
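
One way to see the CPU count that Slurm itself associates with the pending job, as opposed to the count written in the script, might be to read the NumCPUs field of scontrol show job. The num_cpus helper below is a hypothetical filter, and the sample line is made up for illustration; it is not output from this cluster.

```shell
# Print the NumCPUs value from `scontrol show job` style text on stdin.
num_cpus() {
  tr ' ' '\n' | sed -n 's/^NumCPUs=//p'
}

# Made-up sample line for illustration only; not output from this cluster.
sample='   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*'
printf '%s\n' "$sample" | num_cpus

# On the cluster itself, for job 68:
#   scontrol show job 68 | num_cpus
```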

When I change the CPU limit to the maximum number of cores on the server, 56,

sacctmgr show qos where Name=normal_compute format=Name,Priority,UsageFactor,MaxWall,MaxTRESPU
      Name   Priority UsageFactor     MaxWall     MaxTRESPU 
---------- ---------- ----------- ----------- ------------- 
normal_co+         10    1.000000    00:01:00 cpu=56,mem=1G

the script runs perfectly.

cat slurm-68.out 


Suksmandhira H
ITB Indonesia
