<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
p.Code, li.Code, div.Code
{mso-style-name:Code;
mso-style-link:"Code Char";
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
background:#FFF2CC;
font-size:11.0pt;
font-family:"Courier New";
mso-fareast-language:EN-US;}
span.CodeChar
{mso-style-name:"Code Char";
mso-style-link:Code;
font-family:"Courier New";
background:#FFF2CC;}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-GB" link="#0563C1" vlink="#954F72" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">After a bit more investigation, it seems that only jobs which request GPUs are failing to start.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Other jobs start OK, but jobs requesting a GPU sit in the Pending (Resources) state until the controller is restarted, even if no jobs are running on the node at all. This definitely doesn’t seem right to me.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">There are currently user jobs on the node, but if it frees up I can run some more tests to see whether jobs submitted after a controller restart start once and only once per GPU, or whether something else is going on.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Many thanks,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Luke<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal"><span style="font-size:9.0pt;color:#1F497D;mso-fareast-language:EN-GB">--
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:9.0pt;color:#1F497D;mso-fareast-language:EN-GB">Luke Sudbery<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:9.0pt;color:#1F497D;mso-fareast-language:EN-GB">Principal Engineer (HPC and Storage).<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:9.0pt;color:#1F497D;mso-fareast-language:EN-GB">Architecture, Infrastructure and Systems<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:9.0pt;color:#1F497D;mso-fareast-language:EN-GB">Advanced Research Computing, IT Services<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:9.0pt;color:#1F497D;mso-fareast-language:EN-GB">Room 132, Computer Centre G5, Elms Road<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:9.0pt;color:#1F497D;mso-fareast-language:EN-GB"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span style="font-size:9.0pt;color:#1F497D;mso-fareast-language:EN-GB">Please note I don’t work on Monday.<o:p></o:p></span></b></p>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="mso-fareast-language:EN-GB">From:</span></b><span lang="EN-US" style="mso-fareast-language:EN-GB"> slurm-users &lt;slurm-users-bounces@lists.schedmd.com&gt;
<b>On Behalf Of </b>Luke Sudbery<br>
<b>Sent:</b> 09 May 2023 17:38<br>
<b>To:</b> slurm-users@schedmd.com<br>
<b>Subject:</b> [slurm-users] Job scheduling bug?<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">We recently upgraded from 20.11.9 to 22.05.8 and appear to have a problem with jobs not being scheduled on nodes with free resources since then.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">It is particularly noticeable on one partition with only one GPU node in it. Jobs queuing for this node are the highest priority in the queue at the moment, and the node is idle, but the jobs do not start:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="Code"><span style="color:black">[sudberlr-admin@bb-er-slurm01 ~]$ squeue -p broadwell-gpum60-ondemand --format "%.18i %.9P %.2t %.10M %.6D %30R %Q"</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> JOBID PARTITION ST TIME NODES NODELIST(REASON) PRIORITY</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> 66631657 broadwell PD 0:00 1 (Resources) 230</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> 66609948 broadwell PD 0:00 1 (Resources) 203</span><o:p></o:p></p>
<p class="Code"><span style="color:black">[sudberlr-admin@bb-er-slurm01 ~]$ squeue --format "%Q %i" --sort -Q | head -4</span><o:p></o:p></p>
<p class="Code"><span style="color:black">PRIORITY JOBID</span><o:p></o:p></p>
<p class="Code"><span style="color:black">230 66631657</span><o:p></o:p></p>
<p class="Code"><span style="color:black">212 66622378</span><o:p></o:p></p>
<p class="Code"><span style="color:black">210 66322847</span><o:p></o:p></p>
<p class="Code"><span style="color:black">[sudberlr-admin@bb-er-slurm01 ~]$ scontrol show node bear-pg0212u17b</span><o:p></o:p></p>
<p class="Code"><span style="color:black">NodeName=bear-pg0212u17b Arch=x86_64 CoresPerSocket=10</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> CPUAlloc=0 CPUEfctv=20 CPUTot=20 CPULoad=0.01</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> AvailableFeatures=haswell</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> ActiveFeatures=haswell</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> Gres=gpu:m60:2(S:0-1)</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> NodeAddr=bear-pg0212u17b NodeHostName=bear-pg0212u17b Version=22.05.8</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> OS=Linux 3.10.0-1160.49.1.el7.x86_64 #1 SMP Tue Nov 30 15:51:32 UTC 2021</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> RealMemory=511000 AllocMem=0 FreeMem=501556 Sockets=2 Boards=1</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> MemSpecLimit=501</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> Partitions=broadwell-gpum60-ondemand,system</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> BootTime=2023-04-25T08:24:10 SlurmdStartTime=2023-05-04T11:57:46</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> LastBusyTime=2023-05-09T13:27:07</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> CfgTRES=cpu=20,mem=511000M,billing=20,gres/gpu=2</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> AllocTRES=</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> CapWatts=n/a</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> CurrentWatts=0 AveWatts=0</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s</span><o:p></o:p></p>
<p class="Code"><o:p> </o:p></p>
<p class="Code"><span style="color:black">[sudberlr-admin@bb-er-slurm01 ~]$</span><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The resources it requests are easily met by the node:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="Code"><span style="color:black">[sudberlr-admin@bb-er-slurm01 ~]$ scontrol show job 66631657</span><o:p></o:p></p>
<p class="Code"><span style="color:black">JobId=66631657 JobName=sys/dashboard/sys/bc_uob_paraview</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> UserId=XXXX(633299) GroupId=users(100) MCS_label=N/A</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> Priority=230 Nice=0 Account=XXXX QOS=bbondemand</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> JobState=PENDING Reason=Resources Dependency=(null)</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> RunTime=00:00:00 TimeLimit=02:00:00 TimeMin=N/A</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> SubmitTime=2023-05-09T13:27:31 EligibleTime=2023-05-09T13:27:31</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> AccrueTime=2023-05-09T13:27:31</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> StartTime=Unknown EndTime=Unknown Deadline=N/A</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> SuspendTime=None SecsPreSuspend=0 LastSchedEval=2023-05-09T16:02:30 Scheduler=Main</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> Partition=broadwell-gpum60-ondemand,cascadelake-hdr-ondemand,cascadelake-hdr-ondemand2 AllocNode:Sid=localhost:1120095</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> ReqNodeList=(null) ExcNodeList=(null)</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> NodeList=</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> NumNodes=1-1 NumCPUs=8 NumTasks=8 CPUs/Task=1 ReqB:S:C:T=0:0:*:*</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> TRES=cpu=8,mem=32G,node=1,billing=8,gres/gpu=1</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> MinCPUsNode=1 MinMemoryCPU=4G MinTmpDiskNode=0</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> Features=(null) DelayBoot=00:00:00</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> OverSubscribe=YES Contiguous=0 Licenses=(null) Network=(null)</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> Command=(null)</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> WorkDir=/XXXXXXXXXXXXX</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> StdErr=/XXXXXXXXXXXXX/output.log</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> StdIn=/dev/null</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> StdOut=/XXXXXXXXXXXXX/output.log</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> Power=</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> TresPerNode=gres:gpu:1</span><o:p></o:p></p>
<p class="Code"><o:p> </o:p></p>
<p class="Code"><o:p> </o:p></p>
<p class="Code"><span style="color:black">[sudberlr-admin@bb-er-slurm01 ~]$</span><o:p></o:p></p>
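<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">As a rough cross-check of that claim, the TRES strings above can be compared mechanically. This is only a sketch in Python: the helper names <span class="CodeChar">parse_tres</span> and <span class="CodeChar">shortfalls</span> are made up for illustration, and it compares aggregate totals only, ignoring anything finer-grained that the scheduler may take into account, such as the socket affinity shown in the node's Gres line.<o:p></o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<pre>
```python
# Hypothetical helpers; the TRES strings are copied from the scontrol output above.
UNITS = {"K": 1 / 1024, "M": 1, "G": 1024, "T": 1024 * 1024}  # memory suffix to MiB


def parse_tres(text):
    """Turn a string like 'cpu=8,mem=32G,gres/gpu=1' into a dict of floats."""
    tres = {}
    for item in text.split(","):
        if not item:
            continue  # an empty 'AllocTRES=' value gives an empty string
        name, _, value = item.partition("=")
        if value[-1] in UNITS:
            tres[name] = float(value[:-1]) * UNITS[value[-1]]
        else:
            tres[name] = float(value)
    return tres


def shortfalls(requested, configured, allocated):
    """Return the TRES names for which the request exceeds what is free."""
    short = []
    for name, want in requested.items():
        if name == "node":
            continue  # 'node=1' is a node count, not a per-node resource
        free = configured.get(name, 0.0) - allocated.get(name, 0.0)
        if want > free:
            short.append(name)
    return short


job = parse_tres("cpu=8,mem=32G,node=1,billing=8,gres/gpu=1")      # job TRES
node_cfg = parse_tres("cpu=20,mem=511000M,billing=20,gres/gpu=2")  # node CfgTRES
node_alloc = parse_tres("")                                        # node AllocTRES
print(shortfalls(job, node_cfg, node_alloc))
```
</pre>
<p class="MsoNormal">With the values above this prints an empty list, i.e. no requested TRES exceeds what the idle node has free.<o:p></o:p></p>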
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">This looks like a bug to me, because it was working fine before the upgrade and a simple restart of the Slurm controller will often allow the jobs to start, without any other changes:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="Code"><span style="color:black">[sudberlr-admin@bb-er-slurm01 ~]$ squeue -p broadwell-gpum60-ondemand --format "%.18i %.9P %.2t %.10M %.6D %32R %Q"</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> JOBID PARTITION ST TIME NODES NODELIST(REASON) PRIORITY</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> 66631657 broadwell PD 0:00 1 (Resources) 230</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> 66609948 broadwell PD 0:00 1 (Resources) 203</span><o:p></o:p></p>
<p class="Code"><span style="color:black">[sudberlr-admin@bb-er-slurm01 ~]$ sudo systemctl restart slurmctld; sleep 30; squeue -p broadwell-gpum60-ondemand --format "%.18i %.9P %.2t %.10M %.6D %32R %Q"</span><o:p></o:p></p>
<p class="Code"><span style="color:black">Job for slurmctld.service canceled.</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> JOBID PARTITION ST TIME NODES NODELIST(REASON) PRIORITY</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> 66631657 broadwell R 0:04 1 bear-pg0212u17b 230</span><o:p></o:p></p>
<p class="Code"><span style="color:black"> 66609948 broadwell R 0:04 1 bear-pg0212u17b 203</span><o:p></o:p></p>
<p class="Code"><span style="color:black">[sudberlr-admin@bb-er-slurm01 ~]$</span><o:p></o:p></p>
<p class="Code"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
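<p class="MsoNormal">For repeating this test, the before/after comparison could be scripted along the following lines. This is a minimal sketch in Python, assuming the seven-column squeue format used above; the function names <span class="CodeChar">parse_squeue</span> and <span class="CodeChar">started</span> are hypothetical, and the two snapshots are simply pasted from the session above rather than collected live.<o:p></o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<pre>
```python
# Snapshots copied from the squeue output above (before and after the restart).
BEFORE = """
JOBID PARTITION ST TIME NODES NODELIST(REASON) PRIORITY
66631657 broadwell PD 0:00 1 (Resources) 230
66609948 broadwell PD 0:00 1 (Resources) 203
"""

AFTER = """
JOBID PARTITION ST TIME NODES NODELIST(REASON) PRIORITY
66631657 broadwell R 0:04 1 bear-pg0212u17b 230
66609948 broadwell R 0:04 1 bear-pg0212u17b 203
"""


def parse_squeue(text):
    """Map job ID to (state, nodelist_or_reason) for the format used above."""
    jobs = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if not fields or fields[0] == "JOBID":
            continue  # skip blank lines and the header row
        # Columns: JOBID PARTITION ST TIME NODES NODELIST(REASON) PRIORITY.
        # The reason may contain spaces, so join everything up to the priority.
        jobs[fields[0]] = (fields[2], " ".join(fields[5:-1]))
    return jobs


def started(before, after):
    """Job IDs that were pending in 'before' and are running in 'after'."""
    return sorted(
        jobid
        for jobid, (state, _) in before.items()
        if state == "PD" and after.get(jobid, ("", ""))[0] == "R"
    )


print(started(parse_squeue(BEFORE), parse_squeue(AFTER)))
```
</pre>
<p class="MsoNormal">For the two snapshots above this prints both job IDs, matching what the restart showed: both pending GPU jobs started without any other change.<o:p></o:p></p>
<p class="MsoNormal"><o:p>&nbsp;</o:p></p>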
<p class="MsoNormal">Has anyone come across this behaviour, or does anyone have any other ideas?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Many thanks,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Luke<o:p></o:p></p>
</div>
</body>
</html>