[slurm-users] Transparently assign different walltime limit to a group of nodes ?

Shenglong Wang sw77 at nyu.edu
Mon Aug 13 07:46:35 MDT 2018


Please try the Slurm job_submit Lua plugin: set up two partitions, one for n06-n10 and one for all nodes. Inside the Lua plugin you can then assign jobs to the appropriate partition based on their requested wall time.
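
A minimal sketch of such a job_submit.lua (the partition names "short" and "long", and routing only jobs that did not choose a partition themselves, are assumptions for illustration):

```lua
-- Sketch of job_submit.lua (requires JobSubmitPlugins=lua in slurm.conf).
-- Partition names "short"/"long" are assumed, not from the original mail.
function slurm_job_submit(job_desc, part_list, submit_uid)
    local ONE_DAY = 24 * 60              -- job_desc.time_limit is in minutes
    if job_desc.partition == nil then    -- only route jobs with no explicit choice
        if job_desc.time_limit ~= nil and job_desc.time_limit <= ONE_DAY then
            job_desc.partition = "short" -- e.g. n[06-10], MaxTime=1-00:00:00
        else
            job_desc.partition = "long"  -- e.g. n[01-05], MaxTime=14-00:00:00
        end
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```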

Best,
Shenglong

> On Aug 13, 2018, at 9:44 AM, Cyrus Proctor <cproctor at tacc.utexas.edu> wrote:
> 
> Hi Jens,
> 
> Check out https://slurm.schedmd.com/reservations.html, specifically the "Reservations Floating Through Time" section. In your case, set a walltime limit of 14 days for the partition that contains n[01-10]. Then create a floating reservation on nodes n[06-10] starting at now + 1 day, where "now" is continually re-evaluated, so the reservation always stays one day in the future.
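> 
> A hedged sketch of that floating reservation (reservation name and user list are illustrative):
> 
> ```shell
> # With Flags=TIME_FLOAT the start time is re-evaluated relative to the
> # current time, so the reservation always floats one day ahead: jobs on
> # n[06-10] must fit into the next 24 hours, i.e. walltime < 1 day.
> scontrol create reservation ReservationName=float_short \
>     StartTime=now+1day Duration=UNLIMITED Nodes=n[06-10] \
>     Flags=TIME_FLOAT Users=root
> ```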
> 
> If you wish to allow the user more control, then specify a "Feature" in slurm.conf for your nodes. Something like:
> NodeName=n[01-05] Sockets=1 CoresPerSocket=48 ThreadsPerCore=2 State=UNKNOWN Feature=long
> NodeName=n[06-10] Sockets=1 CoresPerSocket=48 ThreadsPerCore=2 State=UNKNOWN Feature=short
> 
> The feature is an arbitrary string that the admin sets. A user could then specify something like the following at submission:
> sbatch --constraint="long|short" batch.slurm
> 
> Best,
> Cyrus
> 
> On 08/13/2018 08:28 AM, Loris Bennett wrote:
>> Hi Jens,
>> 
>> Jens Dreger <jens.dreger at physik.fu-berlin.de> writes:
>> 
>>> Hi everyone!
>>> 
>>> Is it possible to transparently assign different walltime limits
>>> to nodes without forcing users to specify partitions when submitting
>>> jobs?
>>> 
>>> Example: let's say I have 10 nodes. Nodes n01-n05 should be available
>>> for jobs with a walltime of up to 14 days, while n06-n10 should only
>>> be used for jobs with a walltime limit of less than 1 day. Then, as long
>>> as nodes n06-n10 have free resources, jobs with walltime < 1 day should
>>> be scheduled to these nodes. If n06-n10 are full, jobs with walltime
>>> < 1 day should start on n01-n05. Users should not have to specify
>>> partitions.
>>> 
>>> Would this even be possible with just one partition, much like
>>> using node weights to fill the nodes with less memory first on
>>> clusters with different memory sizes?
>>> 
>>> Background of this question: it would be helpful to be able to
>>> lower the walltime limit for a rack of nodes, e.g. when adding that
>>> rack to an existing cluster, so that just this rack can easily be
>>> shut down after one day in case of instabilities. Much like adding
>>> N nodes to a cluster without changing anything else, with only
>>> jobs having a walltime < 1 day on these nodes in the beginning.
>> If you just want to reduce the allowed wall-time for a given rack, can't
>> you just use a maintenance reservation for the appropriate set of nodes?
>> 
>> Cheers,
>> 
>> Loris
>> 
> 


