[slurm-users] Transparently assign different walltime limit to a group of nodes ?

Jens Dreger jens.dreger at physik.fu-berlin.de
Mon Aug 13 10:20:46 MDT 2018


Hi Cyrus!

On Mon, Aug 13, 2018 at 08:44:15AM -0500, Cyrus Proctor wrote:
> Hi Jens,
> 
> Check out https://slurm.schedmd.com/reservations.html, specifically the
> "Reservations Floating Through Time" section. In your case, set a walltime
> of 14 days for the partition that contains n[01-10]. Then create a floating
> reservation on nodes n[06-10] for n + 1 day, where "n" is always evaluated
> as now.

This is just perfect! Thank you!
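
If I read the reservations page correctly, that would be something like
the following (untested here; ReservationName and Users are placeholders):

  scontrol create reservation ReservationName=short_only Users=root \
      Nodes=n[06-10] StartTime=now+1day Duration=infinite Flags=TIME_FLOAT

With TIME_FLOAT, the start time always stays one day in the future, so jobs
with a time limit over one day never fit on n[06-10], while shorter jobs do.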

Jens.

> 
> If you wish to allow the user more control, then specify a "Feature" in
> slurm.conf for your nodes. Something like:
> NodeName=n[01-05] Sockets=1 CoresPerSocket=48 ThreadsPerCore=2 State=UNKNOWN
> Feature=long
> NodeName=n[06-10] Sockets=1 CoresPerSocket=48 ThreadsPerCore=2 State=UNKNOWN
> Feature=short
> 
> The feature is an arbitrary string that the admin sets. Then a user could
> specify something like this in their submission:
> sbatch --constraint="long|short" batch.slurm
> 
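Regarding the weights idea from my original mail: if I read slurm.conf
correctly, the features could be combined with node weights so that all
jobs prefer n[06-10] while those nodes have room (an untested sketch;
the Weight values are arbitrary):

  NodeName=n[01-05] Sockets=1 CoresPerSocket=48 ThreadsPerCore=2 State=UNKNOWN Weight=20 Feature=long
  NodeName=n[06-10] Sockets=1 CoresPerSocket=48 ThreadsPerCore=2 State=UNKNOWN Weight=10 Feature=short

Slurm allocates the nodes with the lowest weight first, so short jobs would
fill n[06-10] before spilling over onto n[01-05], while the floating
reservation keeps jobs longer than one day off those nodes entirely.
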
> Best,
> Cyrus
> 
> On 08/13/2018 08:28 AM, Loris Bennett wrote:
> 
>     Hi Jens,
> 
>     Jens Dreger <jens.dreger at physik.fu-berlin.de> writes:
> 
> 
>         Hi everyone!
> 
>         Is it possible to transparently assign different walltime limits
>         to nodes without forcing users to specify partitions when submitting
>         jobs?
> 
>         Example: let's say I have 10 nodes. Nodes n01-n05 should be available
>         for jobs with a walltime up to 14 days, while n06-n10 should only
>         be used for jobs with a walltime limit less than 1 day. Then as long
>         as nodes n06-n10 have free resources, jobs with walltime <1day should
>         be scheduled to these nodes. If n06-n10 are full, jobs with walltime
>         <1day should start on n01-n05. Users should not have to specify
>         partitions.
> 
>         Would this even be possible to do with just one partition, much
>         like handling nodes of different memory sizes by using weights to
>         fill the nodes with less memory first?
> 
>         The background to this question is that it would be helpful to be
>         able to lower the walltime limit for one rack of nodes, e.g. when
>         adding that rack to an existing cluster, so that just this rack can
>         easily be shut down after one day in case of instabilities. Much
>         like adding N nodes to a cluster without changing anything else,
>         with only jobs of walltime <1 day running on these nodes in the
>         beginning.
> 
>     If you just want to reduce the allowed wall-time for a given rack, can't
>     you just use a maintenance reservation for the appropriate set of nodes?
> 
> 
>     Loris
> 
> 
> 
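
For completeness, Loris' maintenance-reservation suggestion would, if I
understand it correctly, look something like this (again untested; the
reservation name is a placeholder):

  scontrol create reservation ReservationName=rack_test Users=root \
      Nodes=n[06-10] StartTime=now+1day Duration=infinite Flags=MAINT

The difference is that this start time is fixed, so the window in which
jobs can still run on n[06-10] shrinks as the start time approaches,
whereas the TIME_FLOAT variant keeps it a constant one day ahead.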

-- 
Jens Dreger                      Freie Universitaet Berlin
dreger at physik.fu-berlin.de       Fachbereich Physik - ZEDV
Tel: +49 30 83854774             Arnimallee 14
Fax: +49 30 838454774	         14195 Berlin


