[slurm-users] Exposing only requested CPUs to a job on a given node.

Luis R. Torres lrtorres at gmail.com
Thu Jul 1 22:12:11 UTC 2021


Hi Folks,

Thank you for your responses. I put the following configuration in
cgroup.conf, along with the appropriate slurm.conf changes, and wrote a
small program to verify CPU affinity when a job is queued or running in
the cluster.  Results are below.  Thanks so much.

###
#
# Slurm cgroup support configuration file
#
# See man slurm.conf and man cgroup.conf for further
# information on cgroup configuration parameters
#--
CgroupAutomount=yes
CgroupMountpoint=/sys/fs/cgroup
#ConstrainCores=no
ConstrainCores=yes
ConstrainRAMSpace=yes
ConstrainDevices=no
ConstrainKmemSpace=no  # Avoid a known kernel issue
ConstrainSwapSpace=yes
TaskAffinity=no  # Use task/affinity plugin instead
-----
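
The matching slurm.conf changes aren't reproduced here; as a rough sketch
(assumed values, not our exact diff -- see man slurm.conf for your site),
they amount to enabling the cgroup-based proctrack and task plugins plus
core/memory-aware selection:

# Sketch of the relevant slurm.conf lines (assumed, adjust per site)
ProctrackType=proctrack/cgroup
TaskPlugin=task/affinity,task/cgroup
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory

With task/affinity handling CPU binding and task/cgroup enforcing
ConstrainCores/ConstrainRAMSpace, each task can only run on the cores it
was allocated, even though the node's full CPU count is still visible
(as the output below shows).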

srun --tasks=1 --cpus-per-task=1 --partition=long show-affinity.py
pid 1122411's current affinity mask: 401

=====================================
CPUs in system:  20
PID:  1122411
Allocated CPUs/Cores:  2
Affinity List:  {0, 10}
=====================================

srun --tasks=1 --cpus-per-task=4 --partition=long show-affinity.py
pid 1122446's current affinity mask: c03

=====================================
CPUs in system:  20
PID:  1122446
Allocated CPUs/Cores:  4
Affinity List:  {0, 1, 10, 11}
=====================================

srun --tasks=1 --cpus-per-task=6 --partition=long show-affinity.py
pid 1122476's current affinity mask: 1c07

=====================================
CPUs in system:  20
PID:  1122476
Allocated CPUs/Cores:  6
Affinity List:  {0, 1, 2, 10, 11, 12}
=====================================
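
show-affinity.py itself isn't reproduced above; a minimal sketch of an
equivalent check (an assumption about what such a script can look like,
for Python 3 on Linux) is below. os.sched_getaffinity() reflects the
cgroup/affinity mask, while os.cpu_count() still reports every CPU in
the node:

#!/usr/bin/env python3
# Minimal sketch of a show-affinity.py-style check (assumed, not the
# original script). os.sched_getaffinity(0) returns the set of CPUs this
# process may run on; os.cpu_count() reports all CPUs in the node.
import os
import subprocess

pid = os.getpid()
subprocess.run(["taskset", "-p", str(pid)])   # prints the hex affinity mask

allowed = os.sched_getaffinity(0)
print("=====================================")
print("CPUs in system: ", os.cpu_count())
print("PID: ", pid)
print("Allocated CPUs/Cores: ", len(allowed))
print("Affinity List: ", allowed)
print("=====================================")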

On Fri, May 14, 2021 at 1:35 PM Luis R. Torres <lrtorres at gmail.com> wrote:

> Hi Folks,
>
> We are currently running SLURM 20.11.6 with cgroup constraints for
> memory and CPU/core.  Can the scheduler expose only the requested number
> of CPUs/cores to a job?  We have some users who run Python scripts built
> on the multiprocessing module, and the scripts apparently use all of the
> CPUs/cores in a node despite options meant to constrain a task to a given
> number of CPUs.  We would like several multiprocessing jobs to run
> simultaneously on the nodes without stepping on each other.
>
> The sample script I use for testing is below; I'm looking for something
> similar to what can be done with the GPU GRES configuration, where only
> the GPUs requested are exposed to the job requesting them.
>
>
> #!/usr/bin/env python3
>
> import multiprocessing
>
>
> def worker():
>     print("Worker on CPU #%s" % multiprocessing.current_process().name)
>     result = 0
>     for j in range(20):
>         result += j**2
>     print("Result on CPU {} is {}".format(
>         multiprocessing.current_process().name, result))
>     return
>
>
> if __name__ == '__main__':
>     pool = multiprocessing.Pool()
>     jobs = []
>     print("This host exposed {} CPUs".format(multiprocessing.cpu_count()))
>     for i in range(multiprocessing.cpu_count()):
>         p = multiprocessing.Process(target=worker, name=str(i)).start()
>
> Thanks,
> --
> ----------------------------------------
> Luis R. Torres
>
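
One note on the script quoted above: multiprocessing.cpu_count() always
reports every CPU in the node, even when the job is confined by cgroups,
so a pool sized from it will oversubscribe the allocation. A sketch of
sizing the pool from the CPUs actually allocated to the job (again
assuming Python 3 on Linux):

#!/usr/bin/env python3
# Sketch only: size the worker pool from the CPUs the job may actually
# use (the cgroup/affinity set), not from the whole node.
import multiprocessing
import os


def square(j):
    return j ** 2


if __name__ == '__main__':
    allocated = os.sched_getaffinity(0)     # e.g. {0, 1, 10, 11}
    print("Job may use {} CPUs: {}".format(len(allocated), sorted(allocated)))
    with multiprocessing.Pool(processes=len(allocated)) as pool:
        print(pool.map(square, range(20)))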


-- 
----------------------------------------
Luis R. Torres