[slurm-users] Limiting the number of CPU
Sukman
sukman at pusat.itb.ac.id
Mon Nov 11 07:23:11 UTC 2019
Hi Brian,
I see. Thank you for your suggestion; I will definitely try it.
In the meantime, I am facing a new problem: the job cannot start and is held with Reason="Resources".
Could anyone help with this issue?
I previously enabled these options in slurm.conf:
SelectType=select/cons_res
SelectTypeParameters=CR_Core
However, since jobs did not run properly with those options enabled, I have disabled them again.
I then restarted Slurm on both the head node and the compute node.
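To confirm which select plugin the restarted controller is actually using, I assume a generic check like the following should work (standard scontrol usage, nothing specific to my setup):

scontrol show config | grep -i "^Select"
# should print the SelectType and SelectTypeParameters currently loaded by slurmctld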
But now, when I submit a job with the script below, it stays pending.
> the script
#!/bin/bash
#SBATCH --job-name=hostname
##sbatch --time=00:50
##sbatch --mem=10M
##SBATCH --nodes=1
##SBATCH --ntasks=1
##SBATCH --ntasks-per-node=1
##SBATCH --cpus-per-task=1
##SBATCH --nodelist=cn110
srun hostname
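For reference, this is roughly what the script would look like with the requests enabled rather than commented out; active directives need the uppercase "#SBATCH" prefix, and the values are only the placeholders from above, not something I have tested:

#!/bin/bash
#SBATCH --job-name=hostname
#SBATCH --time=00:50
#SBATCH --mem=10M
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
srun hostname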
> scontrol show job 79
JobId=79 JobName=hostname
UserId=sukman(1000) GroupId=nobody(1000) MCS_label=N/A
Priority=4294901753 Nice=0 Account=user QOS=normal_compute
JobState=PENDING Reason=Resources Dependency=(null)
Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
RunTime=00:00:00 TimeLimit=00:01:00 TimeMin=N/A
SubmitTime=2019-11-11T14:10:41 EligibleTime=2019-11-11T14:10:41
StartTime=Unknown EndTime=Unknown Deadline=N/A
PreemptTime=None SuspendTime=None SecsPreSuspend=0
LastSchedEval=2019-11-11T14:18:41
Partition=defq AllocNode:Sid=itbhn02:11211
ReqNodeList=(null) ExcNodeList=(null)
NodeList=(null)
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,node=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
Gres=(null) Reservation=(null)
OverSubscribe=NO Contiguous=0 Licenses=(null) Network=(null)
Command=/home/sukman/script/test_hostname.sh
WorkDir=/home/sukman/script
StdErr=/home/sukman/script/slurm-79.out
StdIn=/dev/null
StdOut=/home/sukman/script/slurm-79.out
Power=
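In case it helps, these are the generic checks I would expect to run to see why the scheduler reports no available resources (standard commands, not tied to my configuration):

sinfo -N -l                    # per-node state in each partition (look for drain/down)
scontrol show node cn110       # detailed state and Reason for the compute node
scontrol show partition defq   # limits and node list of the partition the job is in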
------------------------------------------
Suksmandhira H
ITB Indonesia
----- Original Message -----
From: "Brian W. Johanson" <bjohanso at psc.edu>
To: "Slurm User Community List" <slurm-users at lists.schedmd.com>
Sent: Friday, November 8, 2019 8:58:40 PM
Subject: Re: [slurm-users] Limiting the number of CPU
Suksmandhira,
That QOS specifies walltime, CPU, and memory limits. From the job script, it appears you are within the CPU limit, but the script does not specify walltime or memory, and your squeue output does not show those values (or CPU) for the job.
'scontrol show job=JOBID' will show all of the values. Adding flags=DenyOnLimit to the QOS will cause jobs that exceed a QOS limit to be rejected at submission, so jobs that can never run are not left sitting in the queue.
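As a rough sketch (check the exact syntax against your Slurm version), the flag can be added to the QOS with sacctmgr, using the QOS name shown in your job output:

sacctmgr modify qos normal_compute set flags=DenyOnLimit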
-b