[slurm-users] Compact scheduling strategy for small GPU jobs

Tue Aug 10 15:41:41 UTC 2021

Did Diego's suggestion from [1] not help narrow things down?

[1] https://lists.schedmd.com/pipermail/slurm-users/2021-August/007708.html

From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Jack Chen <scsvip at gmail.com>
Date: Tuesday, August 10, 2021 at 10:08 AM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] Compact scheduling strategy for small GPU jobs

Does anyone have any ideas on this?

On Fri, Aug 6, 2021 at 2:52 PM Jack Chen <scsvip at gmail.com> wrote:
I'm using Slurm 15.08.11. When I submit several 1-GPU jobs, Slurm doesn't allocate nodes using a compact strategy. Does anyone know how to solve this? Would upgrading to the latest Slurm version help?

For example, there are two nodes, A and B, with 8 GPUs each. When I submit eight 1-GPU jobs, Slurm places the first 6 jobs on node A and the last 2 on node B. If I then submit one job requesting 8 GPUs, it stays pending because of GPU fragmentation: node A has 2 idle GPUs and node B has 6 idle GPUs.
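
To make the arithmetic in this example concrete, here is a small Python sketch. It is only an illustration of the fragmentation described above, not Slurm's scheduling algorithm: the function names are hypothetical, and it assumes the 8-GPU job has to fit on a single node.

```python
# Illustrative model of the two-node example; this is NOT how Slurm schedules.
GPUS_PER_NODE = 8  # from the example: nodes A and B have 8 GPUs each


def free_gpus(placement):
    """Return idle GPUs per node, given {node: number of running 1-GPU jobs}."""
    return {node: GPUS_PER_NODE - used for node, used in placement.items()}


def fits_on_one_node(free, gpus_needed):
    """True if some single node has enough idle GPUs (assumed requirement)."""
    return any(idle >= gpus_needed for idle in free.values())


def pack(num_jobs, nodes):
    """Hypothetical compact strategy: fill each node before using the next."""
    placement = {node: 0 for node in nodes}
    remaining = num_jobs
    for node in nodes:
        take = min(remaining, GPUS_PER_NODE)
        placement[node] = take
        remaining -= take
    return placement


# Placement reported in the example: 6 jobs on A, 2 jobs on B.
observed_free = free_gpus({"A": 6, "B": 2})
# Placement a compact strategy would produce for the same eight jobs.
compact_free = free_gpus(pack(8, ["A", "B"]))

print(observed_free, fits_on_one_node(observed_free, 8))
print(compact_free, fits_on_one_node(compact_free, 8))
```

With the observed placement no single node has 8 idle GPUs, so the 8-GPU job cannot start; with the packed placement node B stays completely free and the job fits.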

Thanks in advance!