<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>No, Slurm goes strictly by what the job specifies for memory at
      submit time. Slurm has no way of knowing how much memory a job
      might need in the future. The only way to safely share a node is
      for Slurm to reserve the requested memory for the duration of the
      job. To do otherwise would be a disaster. <br>
    </p>
    <p>Think about it: Your node has 64 GB of RAM.  Job1 starts and
      requests 40 GB of memory, but it doesn't need that much memory
      until the last hour of an 8-hour job. For the first 7 hours, it only
      needs 8 GB of RAM. Two hours later, job2 is submitted; it will run
      for 12 hours and needs 32 GB of memory for almost all of its
      run time. The node has enough cores for both jobs to run
      simultaneously. <br>
    </p>
    <p>If Slurm behaved the way you expected, job2 would start
      immediately. When job1 finally needs that 40 GB of memory, it
      tries to allocate the memory and fails because job2 is
      already using it. That's not fair to job1, and this
      behavior would lead to jobs failing all the time. <br>
    </p>
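    <p>To make the bookkeeping concrete, here is a minimal sketch of the
      check in shell. It is not Slurm's actual code, and the function
      name can_start is made up for this example. It admits a job only
      if its request fits next to what the running jobs have reserved:
      <br>
    </p>
<pre>
```shell
#!/bin/bash
# can_start NODE_GB REQUEST_GB [RESERVED_GB...]   (hypothetical helper)
# Succeeds only if the new request fits beside the memory that running
# jobs have reserved. Reservations are counted, not current usage.
can_start() {
    local node_gb=$1 request_gb=$2
    shift 2
    local reserved_gb=0 r
    for r in "$@"; do
        reserved_gb=$((reserved_gb + r))
    done
    [ $((reserved_gb + request_gb)) -le "$node_gb" ]
}

# 64 GB node; job1 has reserved 40 GB; job2 asks for 32 GB.
if can_start 64 32 40; then
    echo "job2 starts"
else
    echo "job2 stays pending"
fi
```
</pre>
    <p>With job1's 40 GB reservation counted, job2 stays pending. If
      only the 8 GB that job1 is using at that moment were counted
      (can_start 64 32 8), job2 would start, and job1's later growth to
      40 GB would put the two jobs at 72 GB on a 64 GB node. <br>
    </p>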
    <pre class="moz-signature" cols="72">Prentice</pre>
    <div class="moz-cite-prefix">On 4/17/19 1:10 PM, Mahmood Naderan
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CADa2P2VniTa78CU25q+Bqx37vymd_9yGZgy=8y=jNkQFpdfyfQ@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">
        <div class="gmail_default" style="font-family:tahoma,sans-serif">Yes.
          It seems that Slurm reserves whatever the user specifies.
          The other jobs' actual memory usage is less than what their
          users specified. I thought that Slurm would handle that
          dynamically in order to put more jobs in the running state.</div>
        <div class="gmail_default" style="font-family:tahoma,sans-serif"><br
            clear="all">
        </div>
        <div>
          <div dir="ltr" class="gmail_signature"
            data-smartmail="gmail_signature">
            <div dir="ltr"><font face="tahoma,sans-serif">Regards,<br>
                Mahmood</font><br>
              <br>
              <br>
            </div>
          </div>
        </div>
        <br>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Wed, Apr 17, 2019 at 7:54
          PM Prentice Bisbal &lt;<a href="mailto:pbisbal@pppl.gov"
            moz-do-not-send="true">pbisbal@pppl.gov</a>&gt; wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div bgcolor="#FFFFFF">
            <p>Mahmood, <br>
            </p>
            <p>What do you see as the problem here? To me, there is no
              problem and the scheduler is working exactly as it
              should. The reason "Resources" means that there are not
              enough computing resources available for your job to run
              right now, so the job is sitting in the queue in the
              pending state waiting for the necessary resources to
              become available. This is exactly what schedulers are for.<br>
            </p>
            <p>As Andreas pointed out, looking at the output of
              'scontrol show node compute-0-0' that you provided,
              compute-0-0 has 32 cores and 63 GB of RAM. Of those, 9
              cores and 55 GB of RAM have already been allocated,
              leaving 23 cores and only 8 GB of RAM available for other
              jobs. The job you submitted requested 20 cores (tasks,
              technically) and 40 GB of RAM. Since compute-0-0 doesn't
              have enough RAM available, Slurm is keeping your job in
              the queue until enough RAM is available for it to run.
              This is exactly what Slurm should be doing. <br>
            </p>
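            <p>The same arithmetic can be checked from the node record.
              A rough sketch, assuming a hypothetical helper named
              unallocated_mem_mb that reads 'scontrol show node' output
              on stdin and subtracts AllocMem from RealMemory. The
              sample values are the ones quoted in this thread: <br>
            </p>
<pre>
```shell
#!/bin/bash
# unallocated_mem_mb (hypothetical helper): reads `scontrol show node`
# text on stdin and prints RealMemory - AllocMem, in MB.
unallocated_mem_mb() {
    local text real alloc
    text=$(cat)
    real=$(printf '%s\n' "$text" | grep -o 'RealMemory=[0-9]*' | cut -d= -f2)
    alloc=$(printf '%s\n' "$text" | grep -o 'AllocMem=[0-9]*' | cut -d= -f2)
    echo $((real - alloc))
}

# Values copied from the node output quoted in this thread.
node_text='RealMemory=64261 AllocMem=56320 FreeMem=37715 Sockets=32 Boards=1'
left_mb=$(printf '%s\n' "$node_text" | unallocated_mem_mb)
want_mb=$((40 * 1024))   # --mem=40GB, taking 1 GB as 1024 MB
echo "unallocated: ${left_mb} MB, requested: ${want_mb} MB"
```
</pre>
            <p>On a live system the input would come from 'scontrol show
              node compute-0-0 | unallocated_mem_mb'. Note that FreeMem
              reports what the operating system currently has free, not
              what the scheduler still has left to hand out. <br>
            </p>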
            <pre class="gmail-m_-746978286124028451moz-signature" cols="72">Prentice </pre>
            <div class="gmail-m_-746978286124028451moz-cite-prefix">On
              4/17/19 11:00 AM, Henkel, Andreas wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">I think there isn’t enough memory. </div>
              <div dir="ltr">AllocTRES shows mem=55G</div>
              <div dir="ltr">And your job wants another 40G although the
                node only has 63G in total. </div>
              <div dir="ltr">Best,</div>
              <div dir="ltr">Andreas </div>
              <div dir="ltr"><br>
                On 17.04.2019 at 16:45, Mahmood Naderan &lt;<a
                  href="mailto:mahmood.nt@gmail.com" target="_blank"
                  moz-do-not-send="true">mahmood.nt@gmail.com</a>&gt; wrote:<br>
                <br>
              </div>
              <blockquote type="cite">
                <div dir="ltr">
                  <div dir="ltr">
                    <div dir="ltr">
                      <div dir="ltr">
                        <div dir="ltr">
                          <div dir="ltr">
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif">Hi,</div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif">Although
                              it worked fine for previous job runs, the
                              following script is now stuck in the PD
                              (pending) state with the reason (Resources).</div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif"><br>
                            </div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif">$
                              cat slurm_script.sh<br>
                              #!/bin/bash<br>
                              #SBATCH --output=test.out<br>
                              #SBATCH --job-name=g09-test<br>
                              #SBATCH --ntasks=20<br>
                              #SBATCH --nodelist=compute-0-0<br>
                              #SBATCH --mem=40GB<br>
                              #SBATCH --account=z7<br>
                              #SBATCH --partition=EMERALD<br>
                              g09 test.gjf<br>
                              $ sbatch slurm_script.sh<br>
                              Submitted batch job 878<br>
                              $ squeue<br>
                                           JOBID PARTITION     NAME    
                              USER ST       TIME  NODES NODELIST(REASON)<br>
                                             878   EMERALD g09-test
                              shakerza PD       0:00      1 (Resources)<br>
                            </div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif"><br>
                            </div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif"><br>
                            </div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif"><br>
                            </div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif">However,
                              everything looks good.</div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif"><br>
                            </div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif">$
                              sacctmgr list association
                              format=user,account,partition,grptres%20 |
                              grep shaker<br>
                              shakerzad+      local<br>
                              shakerzad+         z7    emerald      
                              cpu=20,mem=40G<br>
                              $ scontrol show node compute-0-0<br>
                              NodeName=compute-0-0 Arch=x86_64
                              CoresPerSocket=1<br>
                                 CPUAlloc=9 CPUTot=32 CPULoad=8.89<br>
                                 AvailableFeatures=rack-0,32CPUs<br>
                                 ActiveFeatures=rack-0,32CPUs<br>
                                 Gres=(null)<br>
                                 NodeAddr=10.1.1.254
                              NodeHostName=compute-0-0 Version=18.08<br>
                                 OS=Linux 3.10.0-693.5.2.el7.x86_64 #1
                              SMP Fri Oct 20 20:32:50 UTC 2017<br>
                                 RealMemory=64261 AllocMem=56320
                              FreeMem=37715 Sockets=32 Boards=1<br>
                                 State=MIXED ThreadsPerCore=1
                              TmpDisk=444124 Weight=20511900 Owner=N/A
                              MCS_label=N/A<br>
                                 Partitions=CLUSTER,WHEEL,EMERALD,QUARTZ<br>
                                 BootTime=2019-04-06T10:03:47
                              SlurmdStartTime=2019-04-06T10:05:54<br>
                                 CfgTRES=cpu=32,mem=64261M,billing=47<br>
                                 AllocTRES=cpu=9,mem=55G<br>
                                 CapWatts=n/a<br>
                                 CurrentWatts=0 LowestJoules=0
                              ConsumedJoules=0<br>
                                 ExtSensorsJoules=n/s ExtSensorsWatts=0
                              ExtSensorsTemp=n/s<br>
                              <br>
                            </div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif"><br>
                            </div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif">Any
                              idea?</div>
                            <div class="gmail_default"
                              style="font-family:tahoma,sans-serif"><br>
                            </div>
                            <div>
                              <div dir="ltr"
                                class="gmail-m_-746978286124028451gmail_signature">
                                <div dir="ltr"><font
                                    face="tahoma,sans-serif">Regards,<br>
                                    Mahmood</font><br>
                                  <br>
                                  <br>
                                </div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </blockquote>
            </blockquote>
          </div>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>