<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none"><!-- p { margin-top: 0px; margin-bottom: 0px; }--></style>
</head>
<body dir="ltr" style="font-size:12pt;color:#000000;background-color:#FFFFFF;font-family:Calibri,Arial,Helvetica,sans-serif;">
<p>Hi,<br>
</p>
<p><br>
</p>
<p>We recently set up Fair Tree scheduling (we are running Slurm 19.05) and are trying to use sshare to view usage information. Unfortunately, sshare reports all zeros, even though there appears to be usage data in the backend database. Here is an example of the output:<br>
</p>
<p><br>
</p>
<div><span style="font-family:'Courier New',monospace">$ sshare -l</span></div>
<div><span style="font-family:'Courier New',monospace"> Account User RawShares NormShares RawUsage NormUsage EffectvUsage FairShare LevelFS GrpTRESMins TRESRunMins </span></div>
<div><span style="font-family:'Courier New',monospace">-------------------- ---------- ---------- ----------- ----------- ----------- ------------- ---------- ---------- ------------------------------ ------------------------------ </span></div>
<div><span style="font-family:'Courier New',monospace">root 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> covid 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> covid-01 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> covid-02 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> group1 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> subgroup1 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> othersubgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> subgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> subgroups 4 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> subgroups 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> SUBGROUP 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+ </span></div>
<div><span style="font-family:'Courier New',monospace"> SUBGROUP 1 0 0.000000 0.000000 cpu=0,mem=0,energy=0,node=0,b+</span><br>
</div>
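<p>In case it helps anyone reproduce the check, here is a rough Python sketch of how the usage column could be tested programmatically. It assumes pipe-delimited text of the kind produced by sshare's parsable output mode; the helper names (parse_sshare, accounts_with_usage) and the sample rows are hypothetical, written only to mirror the shape of the listing above, and are not our real data.</p>

```python
def parse_sshare(text):
    """Parse pipe-delimited sshare-style text into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [h.strip() for h in lines[0].split("|")]
    rows = []
    for ln in lines[1:]:
        # Account names are indented to show the hierarchy, so strip each field.
        fields = [f.strip() for f in ln.split("|")]
        rows.append(dict(zip(header, fields)))
    return rows


def accounts_with_usage(rows):
    """Return the names of accounts whose RawUsage is present and non-zero."""
    used = []
    for row in rows:
        raw = row.get("RawUsage", "")
        if raw and float(raw) != 0:
            used.append(row["Account"])
    return used


# Hypothetical sample (not real data), shaped like the listing above.
SAMPLE = """Account|User|RawShares|RawUsage|EffectvUsage
root|||0|0.000000
 covid||1|0|0.000000
 group1||1|0|0.000000
"""

rows = parse_sshare(SAMPLE)
print(accounts_with_usage(rows))  # an empty list means every RawUsage is zero
```

<p>An empty result corresponds to the symptom described here: every account in the tree reports zero raw usage.</p>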
<p><br>
</p>
<p><br>
</p>
<p>And here is our slurm.conf: <br>
</p>
<p><br style="font-family: 'Courier New', monospace;">
</p>
<div><span style="font-family: 'Courier New', monospace;">ClusterName=trixie</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmctldHost=trixie(10.10.0.11)</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmctldHost=hn2(10.10.0.12)</span></div>
<div><span style="font-family: 'Courier New', monospace;">GresTypes=gpu</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmUser=slurm</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmctldPort=6817</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmdPort=6818</span></div>
<div><span style="font-family: 'Courier New', monospace;">AuthType=auth/munge</span></div>
<div><span style="font-family: 'Courier New', monospace;">StateSaveLocation=/gpfs/share/slurm/</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmdSpoolDir=/var/spool/slurm/d</span></div>
<div><span style="font-family: 'Courier New', monospace;">SwitchType=switch/none</span></div>
<div><span style="font-family: 'Courier New', monospace;">MpiDefault=none</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmctldPidFile=/var/run/slurmctld.pid</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmdPidFile=/var/run/slurmd.pid</span></div>
<div><span style="font-family: 'Courier New', monospace;">ProctrackType=proctrack/cgroup</span></div>
<div><span style="font-family: 'Courier New', monospace;">ReturnToService=2</span></div>
<div><span style="font-family: 'Courier New', monospace;">PrologFlags=x11</span></div>
<div><span style="font-family: 'Courier New', monospace;">TaskPlugin=task/cgroup</span></div>
<div><br style="font-family: 'Courier New', monospace;">
</div>
<div><span style="font-family: 'Courier New', monospace;"># TIMERS</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmctldTimeout=60</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmdTimeout=300</span></div>
<div><span style="font-family: 'Courier New', monospace;">InactiveLimit=0</span></div>
<div><span style="font-family: 'Courier New', monospace;">MinJobAge=300</span></div>
<div><span style="font-family: 'Courier New', monospace;">KillWait=30</span></div>
<div><span style="font-family: 'Courier New', monospace;">Waittime=0</span></div>
<div><span style="font-family: 'Courier New', monospace;">#</span></div>
<div><br style="font-family: 'Courier New', monospace;">
</div>
<div><span style="font-family: 'Courier New', monospace;"># SCHEDULING</span></div>
<div><span style="font-family: 'Courier New', monospace;">SchedulerType=sched/backfill</span></div>
<div><span style="font-family: 'Courier New', monospace;">SelectType=select/cons_res</span></div>
<div><span style="font-family: 'Courier New', monospace;">SelectTypeParameters=CR_Core_Memory</span></div>
<div><span style="font-family: 'Courier New', monospace;">FastSchedule=1</span></div>
<div><br style="font-family: 'Courier New', monospace;">
</div>
<div><span style="font-family: 'Courier New', monospace;">SchedulerParameters=bf_interval=60,bf_continue,bf_resolution=600,bf_window=2880,bf_max_job_test=5000,bf_max_job_part=1000,bf_max_job_user=10,bf_max_job_start=100</span></div>
<div><br style="font-family: 'Courier New', monospace;">
</div>
<div><span style="font-family: 'Courier New', monospace;">PriorityType=priority/multifactor</span></div>
<div><span style="font-family: 'Courier New', monospace;">PriorityDecayHalfLife=14-0</span></div>
<div><span style="font-family: 'Courier New', monospace;">PriorityWeightFairshare=100000</span></div>
<div><span style="font-family: 'Courier New', monospace;">PriorityWeightAge=1000</span></div>
<div><span style="font-family: 'Courier New', monospace;">PriorityWeightPartition=10000</span></div>
<div><span style="font-family: 'Courier New', monospace;">PriorityWeightJobSize=1000</span></div>
<div><span style="font-family: 'Courier New', monospace;">PriorityMaxAge=1-0</span></div>
<div><br style="font-family: 'Courier New', monospace;">
</div>
<div><span style="font-family: 'Courier New', monospace;"># LOGGING</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmctldDebug=3</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmctldLogFile=/var/log/slurmctld.log</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmdDebug=3</span></div>
<div><span style="font-family: 'Courier New', monospace;">SlurmdLogFile=/var/log/slurmd.log</span></div>
<div><span style="font-family: 'Courier New', monospace;">JobCompType=jobcomp/none</span></div>
<div><br style="font-family: 'Courier New', monospace;">
</div>
<div><span style="font-family: 'Courier New', monospace;"># ACCOUNTING</span></div>
<div><span style="font-family: 'Courier New', monospace;">JobAcctGatherType=jobacct_gather/linux</span></div>
<div><span style="font-family: 'Courier New', monospace;">AccountingStorageType=accounting_storage/slurmdbd</span></div>
<div><span style="font-family: 'Courier New', monospace;">AccountingStorageHost=hn2</span></div>
<div><span style="font-family: 'Courier New', monospace;">AccountingStorageTRES=gres/gpu</span></div>
<div><br style="font-family: 'Courier New', monospace;">
</div>
<div><span style="font-family: 'Courier New', monospace;"># COMPUTE NODES</span></div>
<div><span style="font-family: 'Courier New', monospace;">NodeName=cn[101-136] Procs=32 Gres=gpu:4 RealMemory=192782</span></div>
<div><br style="font-family: 'Courier New', monospace;">
</div>
<div><span style="font-family: 'Courier New', monospace;"># Partitions</span></div>
<div><span style="font-family: 'Courier New', monospace;">PartitionName=JobTesting Nodes=cn[135-136] MaxTime=02:00:00 DefaultTime=00:30:00 MaxMemPerNode=192782 AllowGroups=DT-AI4DCluster-All State=UP</span></div>
<div><span style="font-family: 'Courier New', monospace;">PartitionName=TrixieMain Nodes=cn[106-134] MaxTime=48:00:00 DefaultTime=08:00:00 MaxMemPerNode=192782 AllowGroups=DT-AI4DCluster-All State=UP Default=YES</span></div>
<div><span style="font-family: 'Courier New', monospace;">PartitionName=ItOpsTests Nodes=cn[102-105] MaxTime=INFINITE MaxMemPerNode=192782 AllowGroups=Admin-Access,Manager-Access State=UP</span></div>
<div><span style="font-family: 'Courier New', monospace;">PartitionName=ItOpsImage Nodes=cn101 MaxTime=INFINITE MaxMemPerNode=192782 AllowGroups=Admin-Access State=UP</span></div>
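<p>To make the priority-related part of that file easier to review, here is a small Python sketch that pulls the Priority* keys out of slurm.conf-style text. The function names (parse_slurm_conf, priority_settings) are hypothetical helpers, the excerpt is copied from the file above, and the parser is deliberately naive: it assumes one Key=Value pair per line and treats keys as case-sensitive, which is enough for this file.</p>

```python
def parse_slurm_conf(text):
    """Collect simple Key=Value settings from slurm.conf-style text."""
    settings = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # These keys repeat and carry several attributes per line; skip them here.
        if key in ("NodeName", "PartitionName", "SlurmctldHost"):
            continue
        settings[key] = value
    return settings


def priority_settings(settings):
    """Return only the settings whose key starts with 'Priority'."""
    return {k: v for k, v in settings.items() if k.startswith("Priority")}


# Excerpt taken from the configuration shown above.
CONF = """
PriorityType=priority/multifactor
PriorityDecayHalfLife=14-0
PriorityWeightFairshare=100000
# ACCOUNTING
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=hn2
"""

prio = priority_settings(parse_slurm_conf(CONF))
print(prio["PriorityType"])     # prints priority/multifactor
print("PriorityFlags" in prio)  # prints False: no PriorityFlags line in this file
```

<p>Run against the full file, this would list the seven Priority* lines shown above and nothing else.</p>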
<div><br>
Is there anything that would explain why sshare returns only zeros?<br>
</div>
<div><br>
</div>
<p>The only peculiarity I can think of is that I don't believe we restarted slurmctld after making the change; we only reconfigured it.<br>
</p>
<p><br>
</p>
<p>Cheers,<br>
</p>
<p><br>
</p>
<div id="Signature">
<div name="divtagdefaultwrapper" style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:; margin:0">
<font style=""><font style="">Joey Dumont</font></font>
<div style=""><br>
</div>
<div style="">Technical Advisor, Knowledge, Information, and Technology Services</div>
<div style="">National Research Council Canada / Governement of Canada</div>
<div style=""><a tabindex="0" href="mailto:joey.dumont@nrc-cnrc.gc.ca" id="NoLP">joey.dumont@nrc-cnrc.gc.ca</a> / Tel: 613-990-8152 / Cell: 438-340-7436</div>
<div style=""><br>
</div>
<div style="">Conseiller technique, Services du savoir, de l'information et de la technologie</div>
<div style="">Conseil national de recherches Canada / Gouvernement du Canada</div>
<div><a tabindex="0" href="mailto:joey.dumont@nrc-cnrc.gc.ca" id="NoLP">joey.dumont@nrc-cnrc.gc.ca</a> / Tél.: 613-990-8152 / Tél. cell.: 438-340-7436<br>
</div>
</div>
</div>
</body>
</html>