[slurm-users] Keep CPU Jobs Off GPU Nodes
Ward Poelmans
ward.poelmans at vub.be
Wed Mar 29 06:57:43 UTC 2023
Hi,
We have dedicated partitions for GPUs (their names end with _gpu) and simply forbid jobs that do not request GPU resources from using those partitions:
local function job_total_gpus(job_desc)
    -- Return the total number of GPUs requested by the job.
    -- There are many ways to request a GPU; this comes from the job_submit example in the Slurm source.
    -- A GPU resource is either nil or "gres:gpu:N", with N the number of GPUs requested.
    -- Pick the relevant job resources for each GPU spec (undefined resources can show limit values).
    local gpu_specs = {
        ['tres_per_node']   = 1,
        ['tres_per_task']   = 1,
        ['tres_per_socket'] = 1,
        ['tres_per_job']    = 1,
    }
    -- number of nodes (the large values are Slurm's "not set" sentinels)
    if job_desc['min_nodes'] < 0xFFFFFFFE then gpu_specs['tres_per_node'] = job_desc['min_nodes'] end
    -- number of tasks
    if job_desc['num_tasks'] < 0xFFFFFFFE then gpu_specs['tres_per_task'] = job_desc['num_tasks'] end
    -- number of sockets
    if job_desc['sockets_per_node'] < 0xFFFE then gpu_specs['tres_per_socket'] = job_desc['sockets_per_node'] end
    -- GPUs requested per socket apply on every allocated node
    gpu_specs['tres_per_socket'] = gpu_specs['tres_per_socket'] * gpu_specs['tres_per_node']

    -- extract the GPU count from each tres_* field of the job description
    local gpu_options = {}
    for tres_name, _ in pairs(gpu_specs) do
        local num_gpus = string.match(tostring(job_desc[tres_name]), "^gres:gpu:([0-9]+)") or 0
        gpu_options[tres_name] = tonumber(num_gpus)
    end

    -- calculate total GPUs: the first tres_* field that requests GPUs wins
    for tres_name, multiplier in pairs(gpu_specs) do
        local num_gpus = gpu_options[tres_name]
        if num_gpus > 0 then
            return num_gpus * tonumber(multiplier)
        end
    end
    return 0
end
function slurm_job_submit(job_desc, part_list, submit_uid)
    local total_gpus = job_total_gpus(job_desc)
    slurm.log_debug("Job total number of GPUs: %s", tostring(total_gpus))
    if total_gpus == 0 then
        -- a job may list several partitions, comma separated; reject if any of them is a GPU partition
        for partition in string.gmatch(tostring(job_desc.partition), '([^,]+)') do
            if string.match(partition, '_gpu$') then
                slurm.log_user(string.format('ERROR: GPU partition %s is not allowed for non-GPU jobs.', partition))
                return slurm.ESLURM_INVALID_GRES
            end
        end
    end
    return slurm.SUCCESS
end
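
If you want to sanity-check the logic without reloading slurmctld, a small standalone harness along these lines should do. Note that the slurm stub, the partition names and the error value below are made-up placeholders, not the real slurmctld environment:

-- stub the parts of the slurm table that job_submit.lua uses, then load the script
slurm = {
    SUCCESS = 0,
    ESLURM_INVALID_GRES = 2072,  -- placeholder value; slurmctld provides the real constant
    log_debug = function(fmt, ...) print(string.format(fmt, ...)) end,
    log_user  = function(msg) print(msg) end,
}

dofile('job_submit.lua')

-- CPU-only job aimed at a (made-up) GPU partition: should be rejected
local job = {
    min_nodes = 2,
    num_tasks = 4,
    sockets_per_node = 0xFFFE,   -- "not set" sentinel
    partition = 'ampere_gpu,skylake',
}
print(slurm_job_submit(job, nil, 1000))

-- same job but requesting GPUs per node: should be accepted
job.tres_per_node = 'gres:gpu:2'
print(slurm_job_submit(job, nil, 1000))

Run it with a plain lua interpreter in the same directory as job_submit.lua: the first call prints the rejection message and the (placeholder) error code, the second prints 0 for success.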
Ward
On 29/03/2023 01:24, Frank Pari wrote:
> Well, I wanted to avoid using lua. But, it looks like that's going to be the easiest way to do this without having to create a separate partition for the GPUs. Basically, check for at least one gpu in the job submission and if none exclude all GPU nodes for the job.
>
> Now I'm wondering how to auto-gen the list of nodes with GPUs, so I don't have to remember to update job_submit.lua every time we get new GPU nodes.
>
> -F
>
> On Tue, Mar 28, 2023 at 4:06 PM Frank Pari <parif at bc.edu> wrote:
>
> Hi all,
>
> First, thank you all for participating in this list. I've learned so much just by following along in others' threads. =)
>
> I'm looking at creating a scavenger partition with idle resources from CPU and GPU nodes, and I'd like to keep this to one partition. But I don't want CPU-only jobs using up resources on the GPU nodes.
>
> I've seen suggestions for job/lua scripts. But, I'm wondering if there's any other way to ensure a job has requested at least 1 gpu for the scheduler to assign that job to a GPU node.
>
> Thanks in advance!
>
> -Frank
>