[slurm-users] RFC: Slurm Tool to Automate and Track Large Job Arrays

Erik Surface erik.surface at gmail.com
Fri Jan 18 17:15:17 UTC 2019


Hi, I am a Slurm end user who needs to run ~250k jobs, each taking ~2-4 hours.
Given the traffic on our cluster and a limit of 7000 job submissions at a
time, the full set will take about a month to run, if we are lucky.

I built a generic tool (currently in Bash) that automates the submission and
tracking of jobs on the system. More info here:
https://github.com/esurface/smanage
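
To make the batching problem concrete, here is a minimal sketch in Python of
how ~250k jobs could be split into job-array submissions that each stay under
the 7000-job limit. This is not how smanage is implemented; the function names
and the script name "job.sh" are hypothetical. It also assumes the site's
MaxArraySize setting allows array indices this large, which many clusters do
not; otherwise each batch would need to reuse a small index range and pass an
offset to the job script instead.

```python
from typing import Iterator, List, Tuple


def chunk_ranges(total_jobs: int, max_per_batch: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) index ranges covering total_jobs tasks."""
    if total_jobs < 0 or max_per_batch <= 0:
        raise ValueError("total_jobs must be >= 0 and max_per_batch must be > 0")
    for start in range(0, total_jobs, max_per_batch):
        end = min(start + max_per_batch, total_jobs) - 1
        yield start, end


def sbatch_commands(total_jobs: int, max_per_batch: int, script: str) -> List[List[str]]:
    """Build one 'sbatch --array=START-END' command per batch (nothing is run)."""
    return [
        ["sbatch", f"--array={start}-{end}", script]
        for start, end in chunk_ranges(total_jobs, max_per_batch)
    ]


if __name__ == "__main__":
    # Hypothetical numbers taken from the message above: ~250k jobs, 7000 per batch.
    commands = sbatch_commands(250_000, 7_000, "job.sh")
    print(len(commands), "batches")
    print(" ".join(commands[0]))
    print(" ".join(commands[-1]))
```

The sketch only builds the command lines. A real tool would still have to
submit one batch, wait for slots to free up, record which tasks failed, and
resubmit those, which is the tracking part that takes the effort.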

Are there other tools like this in the wild? Would something like this be
helpful to this community or its end users? Would it be worth building it out
in a more digestible form (C, an API, etc.)?

Thanks,
Erik