Bill, would this allow allocating all the remaining harts when the node is initially half full? How would the parameters be set up for that? The cluster has 14 machines with 56 harts and 128 GB of RAM, and 12 machines with 104 harts and 256 GB of RAM.
Some of the algorithms have hot loops that scale close to, or beyond, the number of harts, so it is always beneficial to use every available hart in an opportunistic, best-effort way. The algorithms train photometric galaxy redshift estimators (galaxy distance calculators), and training will be repeated fairly often because of the large number of available physical parameters. The memory required right now appears to be under 10 GB, but I can't say the same for every algorithm that will be used (at least 6 different ones), nor for the different parameter sets that are expected to be required.
On Thu, Aug 1, 2024 at 4:27 PM Bill via slurm-users slurm-users@lists.schedmd.com wrote:
Either allocate all of the node's cores or all of the node's memory? Both will effectively allocate the node exclusively for you.
So you'll need to know what a node looks like. For a homogeneous cluster, this is straightforward. For a heterogeneous cluster, you may also need to specify a nodelist for, say, those 28-core nodes versus those 64-core nodes.
But going back to the original answer: --exclusive is the answer here. You DO know how many cores you need, right? (A scaling study should give you that.) And you DO know the memory footprint from past jobs with similar inputs, I hope.
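For example, something along these lines (a sketch only; the 56-core count, the node names, and the script name are placeholders you'd adjust to your cluster):

sbatch -N 1 --ntasks-per-node=1 --cpus-per-task=56 job.batch   # ask for every core on a 56-core node
sbatch -N 1 --ntasks-per-node=1 --mem=0 job.batch              # or ask for all of a node's memory instead

On a heterogeneous cluster you could add something like --nodelist=node[01-14] (hypothetical node names) so the requested core count matches the nodes you're targeting.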
Bill
On 8/1/24 3:17 PM, Henrique Almeida via slurm-users wrote:
Hello, maybe I should rephrase the question as: how to fill a whole node?
On Thu, Aug 1, 2024 at 3:08 PM Jason Simms jsimms1@swarthmore.edu wrote:
On the one hand, you say you want "to allocate a whole node for a single multi-threaded process," but on the other you say you want to allow it to "share nodes with other running jobs." Those seem like mutually exclusive requirements.
Jason
On Thu, Aug 1, 2024 at 1:32 PM Henrique Almeida via slurm-users slurm-users@lists.schedmd.com wrote:
Hello, I'm testing it right now and it's working pretty well in a normal situation, but that's not exactly what I want. The --exclusive documentation says that the job allocation cannot share nodes with other running jobs, but I want to allow it to do so if that's unavoidable. Are there other ways to configure it?
The current parameters I'm testing:
sbatch -N 1 --exclusive --ntasks-per-node=1 --mem=0 pz-train.batch
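Inside pz-train.batch the idea is just to point the threading runtime at whatever Slurm actually granted, roughly like this sketch (the environment variable and the training entry point are placeholders; the real script differs):

#!/bin/bash
# Size the thread pool from the CPUs Slurm allocated to the job on this node
export OMP_NUM_THREADS=${SLURM_CPUS_ON_NODE}
python train_pz.py   # placeholder for the actual training command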
On Thu, Aug 1, 2024 at 12:29 PM Davide DelVento davide.quantum@gmail.com wrote:
In part, it depends on how it's been configured, but have you tried --exclusive?
On Thu, Aug 1, 2024 at 7:39 AM Henrique Almeida via slurm-users slurm-users@lists.schedmd.com wrote:
Hello everyone, with Slurm, how do I allocate a whole node for a single multi-threaded process?
https://stackoverflow.com/questions/78818547/with-slurm-how-to-allocate-a-wh...
-- Henrique Dante de Almeida hdante@gmail.com
-- Henrique Dante de Almeida hdante@gmail.com
-- Jason L. Simms, Ph.D., M.P.H. Manager of Research Computing Swarthmore College Information Technology Services (610) 328-8102 Schedule a meeting: https://calendly.com/jlsimms