[slurm-users] Help with developing a lua job submit script

Baker D.J. D.J.Baker at soton.ac.uk
Wed Oct 10 07:27:27 MDT 2018


Hello,


Thank you for your useful replies. It is certainly nowhere near as difficult as I initially thought. We should be able to start some tests later this week.


Best regards,

David


________________________________
From: slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of Roche Ewan <ewan.roche at epfl.ch>
Sent: 10 October 2018 08:07
To: Slurm User Community List
Subject: Re: [slurm-users] Help with developing a lua job submit script

Hello David,
For this use case we have two partitions: serial and parallel (the default). Our Lua looks like:


function slurm_job_submit(job_desc, part_list, submit_uid)
-- As the default partition is set later by SLURM we need to set it here using the same logic

        if job_desc.partition == nil then
                local default_partition = "parallel"
                job_desc.partition = default_partition
        end

        if job_desc.partition == "parallel" and job_desc.min_nodes == 1 then

                if job_desc.min_cpus <= 12 and job_desc.shared ~= 0 then

                        local serial_partition = "serial"
                        job_desc.partition = serial_partition

                        slurm.log_info("slurm_job_submit: for user %u , setting partition: %s", submit_uid, serial_partition)
                end
        end

        return slurm.SUCCESS
end


The initial partition check ensures that jobs submitted to a specific partition, such as debug, are not re-routed by the plugin. Our cutoff is 12 cores (half a node), and we respect the "exclusive" option by looking at the value of job_desc.shared, which is 0 for exclusive jobs.
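
If it helps with testing, here is a minimal sketch of the same rule factored into a plain function, so it can be tried with an ordinary Lua interpreter before going anywhere near slurmctld. The helper name pick_partition and the three constants are illustrative only and are not part of the Slurm API; the field names are the ones used in the script above, and the shared values in the sample tables are stand-ins (0 for exclusive, any non-zero value otherwise).

```lua
-- Hypothetical helper: the routing rule from the script above as a pure function.
-- Partition names and the 12-core cutoff are site-specific assumptions.
local DEFAULT_PARTITION = "parallel"
local SERIAL_PARTITION  = "serial"
local MAX_SERIAL_CPUS   = 12

local function pick_partition(job)
        -- No partition requested: apply the default ourselves.
        local partition = job.partition or DEFAULT_PARTITION

        -- Explicitly chosen partitions (e.g. debug) are left alone.
        if partition ~= DEFAULT_PARTITION then
                return partition
        end

        -- Single-node, small, non-exclusive jobs go to the serial partition.
        if job.min_nodes == 1
           and job.min_cpus <= MAX_SERIAL_CPUS
           and job.shared ~= 0 then
                return SERIAL_PARTITION
        end

        return partition
end

-- Plain tables stand in for job_desc here.
print(pick_partition({ min_nodes = 1, min_cpus = 4,  shared = 1 }))   -- serial
print(pick_partition({ min_nodes = 1, min_cpus = 4,  shared = 0 }))   -- parallel
print(pick_partition({ min_nodes = 2, min_cpus = 24, shared = 1 }))   -- parallel
print(pick_partition({ partition = "debug", min_nodes = 1,
                       min_cpus = 1, shared = 1 }))                   -- debug
```

Inside slurm_job_submit the body would then reduce to job_desc.partition = pick_partition(job_desc) plus the log call. Note that a complete job_submit.lua also has to define slurm_job_modify, even if it only returns slurm.SUCCESS.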

Good luck!

Ewan Roche
EPFL SCITAS


> On 9 Oct 2018, at 17:54, Baker D.J. <D.J.Baker at soton.ac.uk> wrote:
>
> Hello,
>
> We are starting to think about developing a lua job submission script. For example, we are keen to route jobs requiring no more than 1 compute node (single core jobs and small parallel jobs) to a slurm shared partition. The idea being that "small" jobs can share a small set of compute nodes to try to prevent resource fragmentation in the cluster. Other "larger" jobs are routed to the default partition.
>
> As a start I have dug out the example lua script provided by slurm, however I wondered if there was any experience in doing this sort of routing using a lua script. We would appreciate any advice from the slurm community, please.
>
> Best regards,
> David
