[slurm-users] Exposing only requested CPUs to a job on a given node.
Ryan Cox
ryan_cox at byu.edu
Fri May 14 22:52:04 UTC 2021
You can check with something like this inside of a job:

cat /sys/fs/cgroup/cpuset/slurm/uid_$UID/job_$SLURM_JOB_ID/cpuset.cpus

That lists which CPUs the job has access to.
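
From Python, you can read the same information through the process's CPU
affinity mask (a minimal sketch; os.sched_getaffinity is Linux-only and
reflects the cpuset cgroup the job was placed in):

import os

# CPUs this process may run on, as constrained by the job's cpuset cgroup
allowed = os.sched_getaffinity(0)
print("Allowed CPUs: %s (%d total)" % (sorted(allowed), len(allowed)))
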
On 5/14/21 4:40 PM, Renfro, Michael wrote:
>
> Untested, but prior experience with cgroups indicates that if things
> are working correctly, even if your code tries to run as many
> processes as you have cores, those processes will be confined to the
> cores you reserve.
>
> Try a more compute-intensive worker function that takes some seconds or
> minutes to complete, and watch the reserved node with 'top' or a similar
> program. If, for example, the job reserved only 1 core and tried to run
> 20 processes, you'd see 20 processes in 'top', each at roughly 5% CPU.
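>
> For example, a busier worker along these lines (a sketch, not part of the
> original script) keeps each process spinning long enough to watch in 'top':
>
> def busy_worker(n=10_000_000):
>     # burn CPU long enough for the process to show up clearly in 'top'
>     total = 0
>     for j in range(n):
>         total += j * j
>     return total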
>
> To make the code a bit more polite, you can import the os module and
> create a new variable from the SLURM_CPUS_ON_NODE environment variable
> to guide Python into starting the correct number of processes:
>
> cpus_reserved = int(os.environ['SLURM_CPUS_ON_NODE'])
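>
> and then size the worker pool from that value instead of cpu_count() (a
> minimal sketch, assuming the script is reworked to use a Pool):
>
> import os
> import multiprocessing
>
> cpus_reserved = int(os.environ['SLURM_CPUS_ON_NODE'])
>
> def worker(j):
>     return j ** 2
>
> if __name__ == '__main__':
>     # start only as many workers as Slurm actually allocated to the job
>     with multiprocessing.Pool(processes=cpus_reserved) as pool:
>         print(pool.map(worker, range(20)))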
>
> *From: *slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf
> of Rodrigo Santibáñez <rsantibanez.uchile at gmail.com>
> *Date: *Friday, May 14, 2021 at 5:17 PM
> *To: *Slurm User Community List <slurm-users at lists.schedmd.com>
> *Subject: *Re: [slurm-users] Exposing only requested CPUs to a job on
> a given node.
>
> Hi you all,
>
> I'm replying so I get notified of the answers to this question. I have a
> user whose Python script used almost all CPUs, even though the job was
> configured to use only 6 CPUs per task. I reviewed the code, and it
> doesn't make any explicit call to multiprocessing or similar, so the
> user is unaware of what causes this behavior (and so am I).
>
> Running Slurm 20.02.6
>
> Best!
>
> On Fri, May 14, 2021 at 1:37 PM Luis R. Torres <lrtorres at gmail.com> wrote:
>
> Hi Folks,
>
> We are currently running SLURM 20.11.6 with cgroup constraints for
> memory and CPU/core. Can the scheduler expose only the requested number
> of CPUs/cores to a job? We have some users who run Python scripts with
> the multiprocessing module, and the scripts apparently use all of the
> CPUs/cores on a node despite options that constrain a task to a given
> number of CPUs. We would like several multiprocessing jobs to run
> simultaneously on the nodes without stepping on each other.
>
> The sample script I use for testing is below; I'm looking for something
> similar to what can be done with the GPU Gres configuration, where only
> the requested number of GPUs is exposed to the job requesting them.
>
> #!/usr/bin/env python3
>
> import multiprocessing
>
> def worker():
>     print("Worker on CPU #%s" % multiprocessing.current_process().name)
>     result = 0
>     for j in range(20):
>         result += j**2
>     print("Result on CPU {} is {}".format(multiprocessing.current_process().name, result))
>     return
>
> if __name__ == '__main__':
>     pool = multiprocessing.Pool()
>     jobs = []
>     print("This host exposed {} CPUs".format(multiprocessing.cpu_count()))
>     for i in range(multiprocessing.cpu_count()):
>         p = multiprocessing.Process(target=worker, name=i).start()
>
> Thanks,
>
> --
>
> ----------------------------------------
> Luis R. Torres
>
--
Ryan Cox
Director
Office of Research Computing
Brigham Young University