[slurm-users] Exposing only requested CPUs to a job on a given node.
Ryan Cox
ryan_cox at byu.edu
Fri May 14 22:52:04 UTC 2021
You can check with something like this inside of a job:

cat /sys/fs/cgroup/cpuset/slurm/uid_$UID/job_$SLURM_JOB_ID/cpuset.cpus

That lists which CPUs the job has access to.
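
From Python, you can read the same information through the process's CPU
affinity mask (a minimal sketch; os.sched_getaffinity is Linux-only and
reflects the cpuset cgroup the job was placed in):

import os

# CPUs this process may run on, as constrained by the job's cpuset cgroup
allowed = os.sched_getaffinity(0)
print("Allowed CPUs: %s (%d total)" % (sorted(allowed), len(allowed)))
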
On 5/14/21 4:40 PM, Renfro, Michael wrote:
>
> Untested, but prior experience with cgroups indicates that if things
> are working correctly, even if your code tries to run as many
> processes as you have cores, those processes will be confined to the
> cores you reserve.
>
> Try a more compute-intensive worker function that takes some seconds or
> minutes to complete, and watch the reserved node with 'top' or a similar
> program. If, for example, the job reserved only 1 core and tried to run
> 20 processes, you'd see 20 processes in 'top', each at roughly 5% CPU.
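>
> For example, a busier worker along these lines (a sketch, not part of the
> original script) keeps each process spinning long enough to watch in 'top':
>
> def busy_worker(n=10_000_000):
>     # burn CPU long enough for the process to show up clearly in 'top'
>     total = 0
>     for j in range(n):
>         total += j * j
>     return total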
>
> To make the code a bit more polite, you can import the os module and
> create a new variable from the SLURM_CPUS_ON_NODE environment variable
> to guide Python into starting the correct number of processes:
>
> cpus_reserved = int(os.environ['SLURM_CPUS_ON_NODE'])
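>
> and then size the worker pool from that value instead of cpu_count() (a
> minimal sketch, assuming the script is reworked to use a Pool):
>
> import os
> import multiprocessing
>
> cpus_reserved = int(os.environ['SLURM_CPUS_ON_NODE'])
>
> def worker(j):
>     return j ** 2
>
> if __name__ == '__main__':
>     # start only as many workers as Slurm actually allocated to the job
>     with multiprocessing.Pool(processes=cpus_reserved) as pool:
>         print(pool.map(worker, range(20)))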
>
> *From: *slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf
> of Rodrigo Santibáñez <rsantibanez.uchile at gmail.com>
> *Date: *Friday, May 14, 2021 at 5:17 PM
> *To: *Slurm User Community List <slurm-users at lists.schedmd.com>
> *Subject: *Re: [slurm-users] Exposing only requested CPUs to a job on
> a given node.
>
> Hi you all,
>
> I'm replying so I get notified of the answers to this question. I have a
> user whose Python script used almost all CPUs, even though the job was
> configured to use only 6 CPUs per task. I reviewed the code, and it
> doesn't make any explicit call to multiprocessing or similar, so the
> user is unaware of what causes this behavior (and so am I).
>
> Running Slurm 20.02.6
>
> Best!
>
> On Fri, May 14, 2021 at 1:37 PM Luis R. Torres <lrtorres at gmail.com> wrote:
>
> Hi Folks,
>
> We are currently running SLURM 20.11.6 with cgroup constraints for
> memory and CPU/core. Can the scheduler expose only the requested number
> of CPUs/cores to a job? We have some users who run Python scripts with
> the multiprocessing module, and the scripts apparently use all of the
> CPUs/cores on a node despite options that constrain a task to a given
> number of CPUs. We would like several multiprocessing jobs to run
> simultaneously on the nodes without stepping on each other.
>
> The sample script I use for testing is below; I'm looking for something
> similar to what can be done with the GPU Gres configuration, where only
> the requested number of GPUs is exposed to the job requesting them.
>
> #!/usr/bin/env python3
>
> import multiprocessing
>
> def worker():
>     print("Worker on CPU #%s" % multiprocessing.current_process().name)
>     result = 0
>     for j in range(20):
>         result += j**2
>     print("Result on CPU {} is {}".format(multiprocessing.current_process().name, result))
>     return
>
> if __name__ == '__main__':
>     pool = multiprocessing.Pool()
>     jobs = []
>     print("This host exposed {} CPUs".format(multiprocessing.cpu_count()))
>     for i in range(multiprocessing.cpu_count()):
>         p = multiprocessing.Process(target=worker, name=i).start()
>
> Thanks,
>
> --
>
> ----------------------------------------
> Luis R. Torres
>
--
Ryan Cox
Director
Office of Research Computing
Brigham Young University