[slurm-users] Core reserved/bound to a GPU

Chris Samuel chris at csamuel.org
Tue Sep 1 04:36:49 UTC 2020


On Monday, 31 August 2020 7:41:13 AM PDT Manuel BERTRAND wrote:

> Everything works great so far, but now I would like to bind a specific
> core to each GPU on each node. By "bind" I mean to make a particular
> core not assignable to a CPU-only job, so that the GPU is usable
> whatever the CPU workload on the node.

What I've done in the past (waves to Swinburne folks on the list) was to have 
overlapping partitions on the GPU nodes: the GPU job partition had access to 
all the cores, while the CPU-only job partition had access to only a subset 
(limited by the MaxCPUsPerNode parameter on the partition).
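
As a rough sketch of that setup in slurm.conf (node names, core counts 
and partition names here are made up for illustration, not from my actual 
config):

  # GPU nodes: 2 sockets x 16 cores, 4 GPUs each (hypothetical layout)
  NodeName=gpu[01-04] Sockets=2 CoresPerSocket=16 ThreadsPerCore=1 Gres=gpu:4

  # GPU partition can use every core on the node
  PartitionName=gpu Nodes=gpu[01-04] State=UP

  # CPU-only partition overlaps the same nodes but is capped at 28 cores
  # per node, leaving 4 cores that only GPU jobs can take
  PartitionName=cpu Nodes=gpu[01-04] MaxCPUsPerNode=28 State=UP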

The problem you run into there, though, is that there's no way to reserve cores 
on a particular socket. That hurts folks who care about locality for GPU codes: 
they can wait in the queue with GPUs free and cores free, but not the right 
cores on the right socket to be able to use those GPUs. :-(

Here's the bug I filed on this issue back when I was in Australia, suggesting a 
MaxCPUsPerSocket parameter for partitions:

https://bugs.schedmd.com/show_bug.cgi?id=4717

All the best,
Chris
-- 
  Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA
