[slurm-users] How to partition nodes into smaller units

Ansgar Esztermann-Kirchner aeszter at mpibpc.mpg.de
Tue Feb 5 15:46:35 UTC 2019


Hello List,

we're operating a large-ish cluster (about 900 nodes) with diverse
hardware. It has been running with SGE for several years now, but the
more we refine our configuration, the more we're feeling SGE's
limitations. Therefore, we're considering switching to Slurm.

The latest challenge is this: a certain class of nodes has been
optimized for small jobs -- we'd like to split each of them into two
"half nodes", where a job can use one of the two GPUs plus (at most)
half of the CPUs. With SGE, we've put two queues on the nodes, but this
effectively prevents certain maintenance jobs from running.

How would I configure these nodes in Slurm? From the docs I gathered
that MaxTRESPerJob would be a solution, but this is coupled to
associations, which I do not fully understand.
Is this the best/only way to achieve such a partitioning?
If so, do I need to define an association for every user, or can I
define a default/skeleton association that new users automatically
inherit?
Are there other/better ways to go?

Thanks a lot,

A.
-- 
Ansgar Esztermann
Sysadmin
http://www.mpibpc.mpg.de/grubmueller/esztermann