[slurm-users] node showing "Low socket*core count"

Noam Bernstein noam.bernstein at nrl.navy.mil
Wed Oct 10 14:55:51 MDT 2018


> On Oct 10, 2018, at 12:07 PM, Noam Bernstein <noam.bernstein at nrl.navy.mil> wrote:
> 
> 
> slurmd -C confirms that indeed slurm understands the architecture, so that’s good.  However, removing the CPUs entry from the node list doesn’t change anything.  It still drains the node.  If I just remove _everything_ having to do with those counts from the node list item it just picks 1 cpu.
> 

Interestingly, it looks like the only remaining problem may have been that I had to manually set the node's state to Resume.  I had been playing around with different settings for various node properties, and at some point definitely had them inconsistent in a way that would trigger a drain state.  Apparently, once I fixed the configuration, slurm was actually OK with it, but it did not clear the drain state on its own.  It seems to be OK now that all the items are set consistently.
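
For anyone who finds this thread later, here is a minimal sketch of the manual reset, not a definitive recipe.  It assumes the reset is done with scontrol's "update NodeName=... State=RESUME" form.  The node name "node01" and the helper names resume_command and resume_node are hypothetical, purely for illustration.  By default it only builds and returns the command, so you can check it before running anything against a real cluster.

```python
import subprocess


def resume_command(node_name):
    """Build the scontrol argument list that asks Slurm to resume a node."""
    # "node_name" is whatever name the node has in slurm.conf (hypothetical here).
    return ["scontrol", "update", f"NodeName={node_name}", "State=RESUME"]


def resume_node(node_name, dry_run=True):
    """Return the command as a string; execute it only when dry_run is False."""
    cmd = resume_command(node_name)
    if not dry_run:
        # Requires scontrol on PATH and sufficient Slurm privileges.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


if __name__ == "__main__":
    # Dry run: prints the command that would be issued for the example node.
    print(resume_node("node01"))
```

Fixing the node definition so it matches what slurmd -C reports is the first step; the resume is a separate second step, since in my case the drain state stayed in place after the configuration was corrected.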



