[slurm-users] slurm jobs and amount of licenses (matlab)
Alois Schlögl
alois.schloegl at ist.ac.at
Mon Sep 26 13:40:41 UTC 2022
On 9/26/22 at 13:04, Josef Dvoracek wrote:
> hello @list!
>
> anyone who was dealing with following scenario?
>
> * we have limited amount of Matlab network licenses ( and various
> features have various amount of available seats, eg. machine learning:
> N licenses, Image_Toolbox: M licenses)
> * licenses are being used by slurm jobs and by individual users
> directly at their workstations (workstations are not under my control)
>
> Sometimes it happens, that licenses for certain feature, used in
> particular slurm job is already fully consumed, and job fails.
>
> Is there any straightforward trick how to deal with that? Other than
> buying dedicated pool of licenses for our slurm-based data processing
> facility?
>
> EG. let slurm job wait, until there is required license available?
>
> cheers
>
> josef
>
Hello Josef,
yes, we have a similar scenario. There is no straightforward way of
handling this, and Slurm configuration can help only to a certain extent.
The main reason for this is that the license usage depends on how the
jobs are distributed among nodes: licenses are used per user and per node.
That means 1 user running 5 jobs on a single node requires 1 license, 5
users running 1 job each require 5 licenses, and 1 user running 5 jobs
on 5 different nodes also requires 5 licenses.
If 5 users each run 5 jobs on 5 different nodes, up to 25 licenses might
be needed. If you have 10 users and 20 nodes, you might need up to 200
licenses.
And node-based licenses cannot be accessed remotely by multiple users.
Because of that even a dedicated pool with one license per node might
not be sufficient.
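The per-user, per-node counting described above can be sketched in a few
lines of shell. This is only an illustration under the assumption that
one license is consumed per distinct (user, node) pair; the function name
count_licenses and the user and node names are made up for the example.

```shell
# Count licenses under the assumption stated above:
# one license per distinct (user, node) pair.
# Reads one "user node" line per running job from stdin.
count_licenses() {
    sort -u | wc -l | tr -d ' '
}

# 1 user, 5 jobs on a single node: prints 1
printf '%s\n' 'alice n1' 'alice n1' 'alice n1' 'alice n1' 'alice n1' | count_licenses

# 1 user, 5 jobs on 5 different nodes: prints 5
printf '%s\n' 'alice n1' 'alice n2' 'alice n3' 'alice n4' 'alice n5' | count_licenses
```

On a live cluster the input could probably come from something like
squeue -h -t RUNNING -o "%u %N", but note that %N prints a compressed
node list for multi-node jobs, which would have to be expanded first.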
Our approach is to restrict the number of nodes where Matlab is running,
and to mark those nodes with the feature "matlab" so that they can be
selected with "--constraint=matlab".
Moreover, the largest nodes have matlab so that these jobs run on fewer
nodes, and a smaller number of licenses is needed. You might also want
to enforce -singleCompThread, because the speedup from multithreading is
not always what you expect, especially if you have nodes with a large
number of cores.
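As a rough sketch, a job script combining "--constraint=matlab" with
-singleCompThread might look like the following. The file name
matlab_job.sh, the job name and the Matlab script name my_script are
placeholders, and the -batch option assumes Matlab R2019a or later.

```shell
# Write a minimal job script; names and resource values are placeholders.
cat > matlab_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=matlab-example
#SBATCH --constraint=matlab
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1

# One computational thread, matching the single CPU requested above.
matlab -singleCompThread -batch "my_script"
EOF

# Submit with: sbatch matlab_job.sh
```

Requesting a single CPU per task keeps the allocation consistent with
the single computational thread, so several such jobs of one user can
share a node and, by the counting above, one license.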
In addition, we monitor the overall license usage, on a per toolbox
basis, and based on these numbers we decide how many licenses are needed
from which toolbox.
And if somebody is not happy with the limited number of licenses, we
point out that there is also Octave, where the number of available
licenses is not limited, and where the number of cores used can be
controlled with OMP_NUM_THREADS (unlike in Matlab).
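For illustration, inside a job script the thread count for Octave could
be tied to the Slurm allocation roughly like this. The fallback value of
1 and the script name my_script.m are assumptions made for the example.

```shell
# Use the CPU count Slurm allocated to the task; fall back to 1 when
# SLURM_CPUS_PER_TASK is not set (e.g. outside a job, or when
# --cpus-per-task was not given).
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Octave will run with OMP_NUM_THREADS=${OMP_NUM_THREADS}"

# Then start Octave, e.g.: octave --no-gui my_script.m
```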
Cheers,
Alois