<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<div dir="auto">Which services did you restart after changing slurm.conf? Did you run scontrol reconfigure?
<div dir="auto"><br>
</div>
<div dir="auto">Do you have any reservations? You can check with scontrol show res.</div>
<div dir="auto"><br>
</div>
<div dir="auto">Sean</div>
</div>
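<div dir="auto"><br>
</div>
<div dir="auto">As a rough cross-check of the CPU arithmetic in the quoted message, here is a minimal sketch that uses only the CPUTot and CPUAlloc figures posted below. The helper name cpu_fits is hypothetical, and the check deliberately ignores memory, the socket/core layout, reservations, and whether the daemons have actually picked up the new configuration, so a "yes" here does not mean the scheduler will start the job.</div>
<pre>
```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): given a node's total CPUs, the
# CPUs already allocated, and the CPUs a new job needs on that node,
# print "yes" if the remaining CPUs cover the request, otherwise "no".
# This looks at CPU counts only.
cpu_fits() {
    total=$1
    alloc=$2
    need=$3
    free=$((total - alloc))
    if [ "$free" -ge "$need" ]; then
        echo yes
    else
        echo no
    fi
}

# Figures taken from the thread: 6 CPUs are allocated on every node and
# job 129 asks for 5 CPUs per node.
cpu_fits 10 6 5   # head node with the old setting (10 CPUs)
cpu_fits 11 6 5   # head node with the new setting (CPUs=11)
cpu_fits 32 6 5   # compute-0-N nodes (CPUTot=32)
```
</pre>
<div dir="auto">On CPU counts alone, only the old 10-CPU head-node setting falls short, which is why it is worth confirming that the change was actually applied and that nothing else (for example a reservation) is holding resources.</div>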
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, 17 Dec. 2019, 10:35 pm Mahmood Naderan, &lt;<a href="mailto:mahmood.nt@gmail.com">mahmood.nt@gmail.com</a>&gt; wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div dir="ltr">
<div class="gmail_default" style="font-family:tahoma,sans-serif">
<div>&gt;Your running job is requesting 6 CPUs per node (4 nodes, 6 CPUs per node). That means 6 CPUs are being used on node hpc.</div>
<div>&gt;Your queued job is requesting 5 CPUs per node (4 nodes, 5 CPUs per node). In total, if it was running, that would require 11 CPUs on node hpc. But hpc only has 10 cores, so it can't run.</div>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Right... I changed that, but the job is still in the pending state.</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">I modified /etc/slurm/slurm.conf as follows:</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"># grep hpc /etc/slurm/slurm.conf<br>
NodeName=hpc NodeAddr=10.1.1.1 CPUs=11</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
# for i in {0..2}; do scontrol show node compute-0-$i | grep RealMemory; done && scontrol show node hpc | grep RealMemory<br>
   RealMemory=64259 AllocMem=1024 FreeMem=57116 Sockets=32 Boards=1<br>
   RealMemory=120705 AllocMem=1024 FreeMem=66403 Sockets=32 Boards=1<br>
   RealMemory=64259 AllocMem=1024 FreeMem=39966 Sockets=32 Boards=1<br>
   RealMemory=64259 AllocMem=1024 FreeMem=49189 Sockets=11 Boards=1<br>
# for i in {0..2}; do scontrol show node compute-0-$i | grep CPUTot; done && scontrol show node hpc | grep CPUTot<br>
   CPUAlloc=6 CPUTot=32 CPULoad=5.18<br>
   CPUAlloc=6 CPUTot=32 CPULoad=18.94<br>
   CPUAlloc=6 CPUTot=32 CPULoad=5.41<br>
   CPUAlloc=6 CPUTot=11 CPULoad=5.21<br>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">But the job is still pending:</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">$ scontrol show -d job 129<br>
JobId=129 JobName=qe-fb<br>
   UserId=mahmood(1000) GroupId=mahmood(1000) MCS_label=N/A<br>
   Priority=1751 Nice=0 Account=fish QOS=normal WCKey=*default<br>
   JobState=PENDING Reason=Resources Dependency=(null)<br>
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0<br>
   DerivedExitCode=0:0<br>
   RunTime=00:00:00 TimeLimit=30-00:00:00 TimeMin=N/A<br>
   SubmitTime=2019-12-17T15:00:37 EligibleTime=2019-12-17T15:00:37<br>
   AccrueTime=2019-12-17T15:00:37<br>
   StartTime=Unknown EndTime=Unknown Deadline=N/A<br>
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2019-12-17T15:00:38<br>
   Partition=SEA AllocNode:Sid=<a href="http://hpc.scu.ac.ir:14534" target="_blank" rel="noreferrer">hpc.scu.ac.ir:14534</a><br>
   ReqNodeList=(null) ExcNodeList=(null)<br>
   NodeList=(null)<br>
   NumNodes=4-4 NumCPUs=20 NumTasks=20 CPUs/Task=1 ReqB:S:C:T=0:0:*:*<br>
   TRES=cpu=20,mem=40G,node=4,billing=20<br>
   Socks/Node=* NtasksPerN:B:S:C=5:0:*:* CoreSpec=*<br>
   MinCPUsNode=5 MinMemoryNode=10G MinTmpDiskNode=0<br>
   Features=(null) DelayBoot=00:00:00<br>
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)<br>
   Command=/home/mahmood/qe/f_borophene/slurm_qe.sh<br>
   WorkDir=/home/mahmood/qe/f_borophene<br>
   StdErr=/home/mahmood/qe/f_borophene/my_fb.log<br>
   StdIn=/dev/null<br>
   StdOut=/home/mahmood/qe/f_borophene/my_fb.log<br>
   Power=</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">&gt;I'm not aware of any nodes that have 32, or even 10, sockets. Are you sure you want to use the cluster like that?</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Marcus,</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">I installed Slurm via the Slurm roll on Rocks. All 4 nodes are dual-socket Opteron 6282 machines with the following specs:</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">Thread(s) per core:    2<br>
Core(s) per socket:    8<br>
Socket(s):             2</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">I set 11 CPUs for the head node only so that jobs do not fully utilize it.</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">For example, compute-0-0 reports:</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif">$ scontrol show node compute-0-0<br>
NodeName=compute-0-0 Arch=x86_64 CoresPerSocket=1<br>
   CPUAlloc=6 CPUTot=32 CPULoad=5.15<br>
   AvailableFeatures=rack-0,32CPUs<br>
   ActiveFeatures=rack-0,32CPUs<br>
   Gres=(null)<br>
   NodeAddr=10.1.1.254 NodeHostName=compute-0-0<br>
   OS=Linux 3.10.0-1062.1.2.el7.x86_64 #1 SMP Mon Sep 30 14:19:46 UTC 2019<br>
   RealMemory=64259 AllocMem=1024 FreeMem=57050 Sockets=32 Boards=1<br>
   State=MIXED ThreadsPerCore=1 TmpDisk=444124 Weight=20511900 Owner=N/A MCS_label=N/A<br>
   Partitions=CLUSTER,WHEEL,SEA<br>
   BootTime=2019-10-10T19:01:38 SlurmdStartTime=2019-12-17T13:50:37<br>
   CfgTRES=cpu=32,mem=64259M,billing=47<br>
   AllocTRES=cpu=6,mem=1G<br>
   CapWatts=n/a<br>
   CurrentWatts=0 AveWatts=0<br>
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s<br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div class="gmail_default" style="font-family:tahoma,sans-serif"><br>
</div>
<div>
<div dir="ltr" data-smartmail="gmail_signature">
<div dir="ltr"><font face="tahoma,sans-serif">Regards,<br>
Mahmood</font><br>
<br>
<br>
</div>
</div>
</div>
<br>
</div>
<br>
</div>
</blockquote>
</div>
</body>
</html>