[slurm-users] Exposing only requested CPUs to a job on a given node.
Sid Young
sid.young at gmail.com
Thu Jul 1 22:26:29 UTC 2021
Hi Luis,
I have exactly the same issue with a user who needs the reported cores to
reflect the requested cores. If you find a solution that works please
share. :)
Thanks
Sid Young
Translational Research Institute
Sid Young
W: https://off-grid-engineering.com
W: (personal) https://sidyoung.com/
W: (personal) https://z900collector.wordpress.com/
On Fri, Jul 2, 2021 at 8:14 AM Luis R. Torres <lrtorres at gmail.com> wrote:
> Hi Folks,
>
> Thank you for your responses. I wrote the following configuration in
> cgroup.conf along with the appropriate slurm.conf changes, and I wrote a
> program to verify affinity when queued or running in the cluster. Results
> are below. Thanks so much.
>
> ###
> #
> # Slurm cgroup support configuration file
> #
> # See man slurm.conf and man cgroup.conf for further
> # information on cgroup configuration parameters
> #--
> CgroupAutomount=yes
> CgroupMountpoint=/sys/fs/cgroup
> #ConstrainCores=no
> ConstrainCores=yes
> ConstrainRAMSpace=yes
> ConstrainDevices=no
> ConstrainKmemSpace=no    # Avoid a known kernel issue
> ConstrainSwapSpace=yes
> TaskAffinity=no          # Use task/affinity plugin instead
> -----
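>
> For completeness, the slurm.conf side of the change is not shown above; a
> minimal sketch of settings commonly paired with a cgroup.conf like this
> (the exact SelectType/ProctrackType values below are typical examples, not
> necessarily what we run) would be:
>
> ProctrackType=proctrack/cgroup
> TaskPlugin=task/affinity,task/cgroup
> SelectType=select/cons_tres
> SelectTypeParameters=CR_Core_Memory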
>
> srun --tasks=1 --cpus-per-task=1 --partition=long show-affinity.py
> pid 1122411's current affinity mask: 401
>
> =====================================
> CPUs in system: 20
> PID: 1122411
> Allocated CPUs/Cores: 2
> Affinity List: {0, 10}
> =====================================
>
> srun --tasks=1 --cpus-per-task=4 --partition=long show-affinity.py
> pid 1122446's current affinity mask: c03
>
> =====================================
> CPUs in system: 20
> PID: 1122446
> Allocated CPUs/Cores: 4
> Affinity List: {0, 1, 10, 11}
> =====================================
>
> srun --tasks=1 --cpus-per-task=6 --partition=long show-affinity.py
> pid 1122476's current affinity mask: 1c07
>
> =====================================
> CPUs in system: 20
> PID: 1122476
> Allocated CPUs/Cores: 6
> Affinity List: {0, 1, 2, 10, 11, 12}
> =====================================
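>
> I haven't pasted show-affinity.py here; roughly, it does something like the
> following (a sketch, not the exact script, using the Linux-only
> os.sched_getaffinity call to read the CPUs the process may run on):
>
> #!/usr/bin/env python3
> import os
>
> # CPUs present in the node vs. CPUs this process may actually run on
> allowed = os.sched_getaffinity(0)
> print("=====================================")
> print("CPUs in system: {}".format(os.cpu_count()))
> print("PID: {}".format(os.getpid()))
> print("Allocated CPUs/Cores: {}".format(len(allowed)))
> print("Affinity List: {}".format(sorted(allowed)))
> print("=====================================")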
>
> On Fri, May 14, 2021 at 1:35 PM Luis R. Torres <lrtorres at gmail.com> wrote:
>
>> Hi Folks,
>>
>> We are currently running on SLURM 20.11.6 with cgroups constraints for
>> memory and CPU/Core. Can the scheduler only expose the requested number of
>> CPU/Core resources to a job? We have some users who employ Python scripts
>> with the multiprocessing module, and the scripts apparently use all of
>> the CPU/Cores in a node, despite using options to constrain a task to just
>> a given number of CPUs. We would like several multiprocessing jobs to
>> run simultaneously on the nodes, but not step on each other.
>>
>> The sample script I use for testing is below; I'm looking for something
>> similar to what can be done with the GPU GRES configuration, where only the
>> GPUs requested are exposed to the job requesting them.
>>
>>
>> #!/usr/bin/env python3
>>
>> import multiprocessing
>>
>>
>> def worker():
>>     print("Worker on CPU #%s" % multiprocessing.current_process().name)
>>     result = 0
>>     for j in range(20):
>>         result += j**2
>>     print("Result on CPU {} is {}".format(
>>         multiprocessing.current_process().name, result))
>>     return
>>
>>
>> if __name__ == '__main__':
>>     pool = multiprocessing.Pool()  # note: unused below
>>     jobs = []                      # note: unused below
>>     print("This host exposed {} CPUs".format(multiprocessing.cpu_count()))
>>     for i in range(multiprocessing.cpu_count()):
>>         multiprocessing.Process(target=worker, name=str(i)).start()
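>>
>> For what it's worth, multiprocessing.cpu_count() reports every CPU in the
>> node, not just the CPUs allocated to the job. One workaround (assuming
>> Linux, where os.sched_getaffinity is available) is to size the worker
>> count from the affinity mask instead, roughly:
>>
>>     import os
>>     # size the pool from the affinity mask set up by Slurm/cgroups
>>     n_allowed = len(os.sched_getaffinity(0))
>>     pool = multiprocessing.Pool(processes=n_allowed)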
>>
>> Thanks,
>> --
>> ----------------------------------------
>> Luis R. Torres
>>
>
>
> --
> ----------------------------------------
> Luis R. Torres
>