[slurm-users] Limiting the number of CPU

Sukman sukman at pusat.itb.ac.id
Mon Nov 11 07:23:11 UTC 2019

Hi Brian,

I see. Thank you for your suggestion.
I definitely will try it.

Anyway, I am now facing a new problem.
The job cannot start, and the reason given is "Resources".

Could anyone help with this issue?

I previously enabled these options in slurm.conf


However, since jobs did not run properly with those options enabled, I have disabled them again.
I then restarted Slurm on both the head node and the compute node.

But now, when I submit a job with this script, the job stays pending.

> the script

#SBATCH --job-name=hostname
##sbatch --time=00:50
##sbatch --mem=10M
##SBATCH --nodes=1
##SBATCH --ntasks=1
##SBATCH --ntasks-per-node=1
##SBATCH --cpus-per-task=1
##SBATCH --nodelist=cn110

srun hostname

> scontrol show job 79
JobId=79 JobName=hostname
   UserId=sukman(1000) GroupId=nobody(1000) MCS_label=N/A
   Priority=4294901753 Nice=0 Account=user QOS=normal_compute
   JobState=PENDING Reason=Resources Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   RunTime=00:00:00 TimeLimit=00:01:00 TimeMin=N/A
   SubmitTime=2019-11-11T14:10:41 EligibleTime=2019-11-11T14:10:41
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=defq AllocNode:Sid=itbhn02:11211
   ReqNodeList=(null) ExcNodeList=(null)
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   Gres=(null) Reservation=(null)
   OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)
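
When comparing output like the above against QOS or partition limits, it can help to pull out single fields rather than read the whole dump. The following is a minimal sketch, not part of the original post: job_field is a hypothetical helper name, and it assumes the plain KEY=VALUE layout shown above, with values that contain no spaces.

```shell
#!/bin/sh
# job_field (hypothetical helper): print the value of one KEY=VALUE field
# from `scontrol show job` output read on stdin.
# Assumes fields are separated by spaces or newlines and values contain
# no spaces, as in the output shown above.
job_field() {
    # Put every KEY=VALUE pair on its own line, then print the text after
    # the first '=' of the pair whose key matches the requested name.
    tr -s ' ' '\n' |
        awk -F= -v key="$1" '$1 == key { print substr($0, length(key) + 2); exit }'
}
```

For the job above, `scontrol show job 79 | job_field Reason` would print Resources, and `job_field TimeLimit` would print 00:01:00.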


Suksmandhira H
ITB Indonesia

----- Original Message -----
From: "Brian W. Johanson" <bjohanso at psc.edu>
To: "Slurm User Community List" <slurm-users at lists.schedmd.com>
Sent: Friday, November 8, 2019 8:58:40 PM
Subject: Re: [slurm-users] Limiting the number of CPU

That qos specifies a walltime, cpu, and memory limit.  From the job script, it appears you are within the cpu limit.  But the job script specifies neither walltime nor memory, and your squeue output is not showing those values (or cpu) for the job.
'scontrol show job=JOBID' will show all of the values.  Adding flags=DenyOnLimit to the qos will reject a job when it is over a limit of the QOS, which hopefully keeps jobs that will never run from sitting in the queue.
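
As a rough sketch of the suggestion above (not taken from the original reply): the helper below only prints the sacctmgr commands one might run, because changing a QOS needs administrator rights. The function name qos_deny_on_limit_cmds is hypothetical, the QOS name normal_compute comes from the job output earlier in the thread, and the format= columns are just one plausible selection.

```shell
#!/bin/sh
# qos_deny_on_limit_cmds (hypothetical helper): print, but do not run,
# the sacctmgr commands for inspecting a QOS and setting DenyOnLimit on it.
qos_deny_on_limit_cmds() {
    qos=$1
    # Show the limits and flags currently attached to the QOS.
    printf 'sacctmgr show qos %s format=Name,MaxWall,MaxTRES,Flags\n' "$qos"
    # Set the flag in the form given in the reply (flags=DenyOnLimit).
    printf 'sacctmgr modify qos %s set flags=DenyOnLimit\n' "$qos"
}

# QOS name as reported by `scontrol show job 79` in this thread.
qos_deny_on_limit_cmds normal_compute
```

Review the printed commands and run them by hand as a Slurm administrator. Check the first command's output before the second, since the flags= form is expected to set the flag list rather than append to it.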

