<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=windows-1252">
  </head>
  <body>
    <p>Mike, <br>
    </p>
    <p>You didn't include your entire sbatch script, so it's hard
      to say what's going wrong with only a single line to work
      with. Based on what you have told us, I'm guessing you are
      specifying a per-node memory requirement greater than 128000 MB,
      which your smaller nodes cannot satisfy. When
      you specify a nodelist, Slurm assigns your job to all of those
      nodes, not a subset that matches the other job specifications
      (--mem, --mem-per-cpu, --ntasks, etc.):</p>
    <p>
      <blockquote type="cite">
        <dl compact="compact">
          <dt><b>-w</b>, <b>--nodelist</b>=&lt;<i>node name list</i>&gt;</dt>
          <dd>
            Request a specific list of hosts.
            The job will contain <i>all</i> of these hosts and possibly
            additional hosts
            as needed to satisfy resource requirements.
          </dd>
        </dl>
      </blockquote>
      <br>
    </p>
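    <p>To illustrate, here is a minimal sketch of a script that mixes
      both node types. It assumes the node definitions you quoted; the
      memory value and the <code>srun</code> line are hypothetical and
      are not taken from your actual script:</p>
    <pre>#!/bin/bash
# Hypothetical job script; values are illustrative only.
#SBATCH --nodelist=compute[001-014]  # all 14 listed nodes become part of the job
#SBATCH --mem=120000                 # MB per node; has to fit the smallest node (RealMemory=128000)

srun hostname
</pre>
    <p>Running <code>sinfo -N -o "%N %m"</code> lists the memory Slurm
      has configured for each node, which makes it easy to compare
      against whatever your script requests.</p>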
    <pre class="moz-signature" cols="72">Prentice </pre>
    <div class="moz-cite-prefix">On 6/7/21 7:46 PM, Yap, Mike wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:SY2PR01MB2540C25B915D7F4E3E2CA80BD7389@SY2PR01MB2540.ausprd01.prod.outlook.com">
      <style>@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        mso-fareast-language:EN-US;}p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0cm;
        margin-right:0cm;
        margin-bottom:0cm;
        margin-left:36.0pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        mso-fareast-language:EN-US;}span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:"Calibri",sans-serif;
        color:windowtext;}.MsoChpDefault
        {mso-style-type:export-only;
        font-family:"Calibri",sans-serif;
        mso-fareast-language:EN-US;}div.WordSection1
        {page:WordSection1;}ol
        {margin-bottom:0cm;}ul
        {margin-bottom:0cm;}</style>
      <div class="WordSection1">
        <p class="MsoNormal"><span lang="EN-US">Hi All<o:p></o:p></span></p>
        <p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span lang="EN-US">Can anyone advise on
            what might cause the error message below when submitting
            a job?<o:p></o:p></span></p>
        <p class="MsoNormal"><b><span lang="EN-US">sbatch: error: memory
              allocation failure<o:p></o:p></span></b></p>
        <p class="MsoNormal"><span lang="EN-US">The same script used to work
            perfectly fine until I included  <b>#SBATCH
              --nodelist=(compute[015-046])  (once removed, it works as it
              should)<o:p></o:p></b></span></p>
        <p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span lang="EN-US">The issues<o:p></o:p></span></p>
        <ol style="margin-top:0cm" type="1" start="1">
          <li class="MsoListParagraph"
            style="margin-left:0cm;mso-list:l0 level1 lfo1"><span
              lang="EN-US">For the current setup, I have specific
              resources available for each compute node
              <o:p></o:p></span></li>
          <ol style="margin-top:0cm" type="a" start="1">
            <li class="MsoListParagraph"
              style="margin-left:0cm;mso-list:l0 level2 lfo1"><span
                lang="EN-US">(NodeName=compute[007-014] Procs=36
                CoresPerSocket=18 RealMemory=384000 ThreadsPerCore=1
                Boards=1 SocketsPerBoard=2) – newer model<o:p></o:p></span></li>
            <li class="MsoListParagraph"
              style="margin-left:0cm;mso-list:l0 level2 lfo1"><span
                lang="EN-US">(NodeName=compute[001-006] Procs=16
                CoresPerSocket=18 RealMemory=128000 ThreadsPerCore=1
                Boards=1 SocketsPerBoard=2)<o:p></o:p></span></li>
          </ol>
          <li class="MsoListParagraph"
            style="margin-left:0cm;mso-list:l0 level1 lfo1"><span
              lang="EN-US">I have the same resources shared between
              multiple queues (working fine)<o:p></o:p></span></li>
          <li class="MsoListParagraph"
            style="margin-left:0cm;mso-list:l0 level1 lfo1"><span
              lang="EN-US">When running a parallel job, the exact same
              job runs when assigned to a single node category (i.e.
              exclusively on 1a or 1b)<o:p></o:p></span></li>
          <li class="MsoListParagraph"
            style="margin-left:0cm;mso-list:l0 level1 lfo1"><span
              lang="EN-US">When the exact same job is assigned
              across both 1a and 1b, it runs on the 1b nodes but shows no
              activity on the 1a nodes
              <o:p></o:p></span></li>
        </ol>
        <p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span lang="EN-US">Any suggestions?<o:p></o:p></span></p>
        <p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span lang="EN-US">Thanks<o:p></o:p></span></p>
        <p class="MsoNormal"><span lang="EN-US">Mike<o:p></o:p></span></p>
      </div>
    </blockquote>
  </body>
</html>