[slurm-users] Exposing only requested CPUs to a job on a given node.
Sid Young
sid.young at gmail.com
Thu Jul 1 22:26:29 UTC 2021
Hi Luis,
I have exactly the same issue with a user who needs the reported cores to
reflect the requested cores. If you find a solution that works please
share. :)
Thanks
Sid Young
Translational Research Institute
Sid Young
W: https://off-grid-engineering.com
W: (personal) https://sidyoung.com/
W: (personal) https://z900collector.wordpress.com/
On Fri, Jul 2, 2021 at 8:14 AM Luis R. Torres <lrtorres at gmail.com> wrote:
> Hi Folks,
>
> Thank you for your responses. I wrote the following configuration in
> cgroup.conf along with the appropriate slurm.conf changes, and I wrote a
> program to verify affinity when queued or running in the cluster. Results
> are below. Thanks so much.
>
> ###
> #
> # Slurm cgroup support configuration file
> #
> # See man slurm.conf and man cgroup.conf for further
> # information on cgroup configuration parameters
> #--
> CgroupAutomount=yes
> CgroupMountpoint=/sys/fs/cgroup
> #ConstrainCores=no
> ConstrainCores=yes
> ConstrainRAMSpace=yes
> ConstrainDevices=no
> ConstrainKmemSpace=no    # Avoid a known kernel issue
> ConstrainSwapSpace=yes
> TaskAffinity=no          # Use task/affinity plugin instead
> -----
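>
> For completeness, the slurm.conf side of the change is not shown above; a
> minimal sketch of settings commonly paired with a cgroup.conf like this
> (the exact SelectType/ProctrackType values below are typical examples, not
> necessarily what we run) would be:
>
> ProctrackType=proctrack/cgroup
> TaskPlugin=task/affinity,task/cgroup
> SelectType=select/cons_tres
> SelectTypeParameters=CR_Core_Memory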
>
> srun --tasks=1 --cpus-per-task=1 --partition=long show-affinity.py
> pid 1122411's current affinity mask: 401
>
> =====================================
> CPUs in system: 20
> PID: 1122411
> Allocated CPUs/Cores: 2
> Affinity List: {0, 10}
> =====================================
>
> srun --tasks=1 --cpus-per-task=4 --partition=long show-affinity.py
> pid 1122446's current affinity mask: c03
>
> =====================================
> CPUs in system: 20
> PID: 1122446
> Allocated CPUs/Cores: 4
> Affinity List: {0, 1, 10, 11}
> =====================================
>
> srun --tasks=1 --cpus-per-task=6 --partition=long show-affinity.py
> pid 1122476's current affinity mask: 1c07
>
> =====================================
> CPUs in system: 20
> PID: 1122476
> Allocated CPUs/Cores: 6
> Affinity List: {0, 1, 2, 10, 11, 12}
> =====================================
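>
> I haven't pasted show-affinity.py here; roughly, it does something like the
> following (a sketch, not the exact script, using the Linux-only
> os.sched_getaffinity call to read the CPUs the process may run on):
>
> #!/usr/bin/env python3
> import os
>
> # CPUs present in the node vs. CPUs this process may actually run on
> allowed = os.sched_getaffinity(0)
> print("=====================================")
> print("CPUs in system: {}".format(os.cpu_count()))
> print("PID: {}".format(os.getpid()))
> print("Allocated CPUs/Cores: {}".format(len(allowed)))
> print("Affinity List: {}".format(sorted(allowed)))
> print("=====================================")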
>
> On Fri, May 14, 2021 at 1:35 PM Luis R. Torres <lrtorres at gmail.com> wrote:
>
>> Hi Folks,
>>
>> We are currently running on SLURM 20.11.6 with cgroups constraints for
>> memory and CPU/Core. Can the scheduler only expose the requested number of
>> CPU/Core resources to a job? We have some users who employ Python scripts
>> with the multiprocessing module, and the scripts apparently use all of
>> the CPU/Cores in a node, despite using options to constrain a task to just
>> a given number of CPUs. We would like several multiprocessing jobs to
>> run simultaneously on the nodes, but not step on each other.
>>
>> The sample script I use for testing is below; I'm looking for something
>> similar to what can be done with the GPU GRES configuration, where only the
>> GPUs requested are exposed to the job requesting them.
>>
>>
>> #!/usr/bin/env python3
>>
>> import multiprocessing
>>
>>
>> def worker():
>>     print("Worker on CPU #%s" % multiprocessing.current_process().name)
>>     result = 0
>>     for j in range(20):
>>         result += j**2
>>     print("Result on CPU {} is {}".format(
>>         multiprocessing.current_process().name, result))
>>     return
>>
>>
>> if __name__ == '__main__':
>>     pool = multiprocessing.Pool()  # note: unused below
>>     jobs = []                      # note: unused below
>>     print("This host exposed {} CPUs".format(multiprocessing.cpu_count()))
>>     for i in range(multiprocessing.cpu_count()):
>>         multiprocessing.Process(target=worker, name=str(i)).start()
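>>
>> For what it's worth, multiprocessing.cpu_count() reports every CPU in the
>> node, not just the CPUs allocated to the job. One workaround (assuming
>> Linux, where os.sched_getaffinity is available) is to size the worker
>> count from the affinity mask instead, roughly:
>>
>>     import os
>>     # size the pool from the affinity mask set up by Slurm/cgroups
>>     n_allowed = len(os.sched_getaffinity(0))
>>     pool = multiprocessing.Pool(processes=n_allowed)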
>>
>> Thanks,
>> --
>> ----------------------------------------
>> Luis R. Torres
>>
>
>
> --
> ----------------------------------------
> Luis R. Torres
>