Hello,
Thanks for your answers. I will try now!! One more question: is there any way to check whether the cgroup restrictions are working properly during a "running" job or during the SLURM scheduling process?
Thanks again!
Cgroups don’t take effect until the job has started. It’s a bit clunky, but you can do things like this:
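inspect_job_cgroup_memory ()
{
    # Take the last line of the squeue output: job ID becomes $1, user name becomes $2.
    set -- $(squeue "$@" -O JobId,UserName | sed -n '$p');
    # Run a step inside the job as its owner and print the job's memory usage (cgroup v1 path).
    sudo -u "$2" srun --pty --jobid "$1" bash -c 'cat /sys/fs/cgroup/memory/slurm/uid_$(id -u)/job_${SLURM_JOB_ID}/memory.usage_in_bytes'
}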
There are lots of other files in that filesystem hierarchy to report on other things like cpusets, IO etc.
Obviously if you’re not the admin of the system, you can only do this for your own jobs, and then you don’t need the sudo part of the shell function.
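The other files mentioned above can be read in the same way. What follows is only a sketch, assuming the cgroup v1 layout used in the function above. The names job_cgroup_report and CGROUP_ROOT are hypothetical, and the memory.limit_in_bytes and cpuset.cpus locations are assumptions patterned on the memory path, so check what actually exists on your nodes.

```shell
# Hypothetical helper: print a few cgroup v1 files for one job.
# Usage: job_cgroup_report <uid> <jobid>
# CGROUP_ROOT is an illustrative override (not a Slurm setting) so the
# function can be pointed at a different directory tree.
job_cgroup_report() {
    local root="${CGROUP_ROOT:-/sys/fs/cgroup}"
    local sub="slurm/uid_$1/job_$2"
    local f
    for f in "memory/$sub/memory.usage_in_bytes" \
             "memory/$sub/memory.limit_in_bytes" \
             "cpuset/$sub/cpuset.cpus"; do
        if [ -r "$root/$f" ]; then
            # File exists and is readable: print its path and contents.
            printf '%s: %s\n' "$f" "$(cat "$root/$f")"
        else
            # Missing file: the layout may differ (for example under cgroup2).
            printf '%s: not found\n' "$f"
        fi
    done
}
```

It has to run on the node where the job is running, for example job_cgroup_report "$(id -u)" "$SLURM_JOB_ID" from inside a job step.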
Tim
-- Tim Cutts Senior Director, R&D IT - Data, Analytics & AI, Scientific Computing Platform AstraZeneca
From: Gestió Servidors via slurm-users <slurm-users@lists.schedmd.com>
Date: Wednesday, 26 March 2025 at 7:50 am
To: slurm-users@lists.schedmd.com
Subject: [slurm-users] Re: Using more cores/CPUs that requested with
[...]
In addition to checking under /sys/fs/cgroup as Tim said, if this is just to convince yourself that the CPU restriction is working, you could also open `top` on the host running the job and observe that %CPU is now held to 200.0 or lower (or, if the job runs multiple processes, that their values sum to that) instead of 4800 or whatever all the cores would add up to.
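For a non-interactive version of that check, the per-process %CPU values can be summed and compared with the expected cap. This is a rough sketch only: cpu_within_limit is a hypothetical helper name, the 200 is the figure from the example above, and `ps -o pcpu=` reports CPU time averaged over each process's lifetime rather than the instantaneous figure `top` shows, so the numbers will not match `top` exactly.

```shell
# Hypothetical helper: read one %CPU value per line on stdin, print the
# total, and return non-zero if the total exceeds the limit given as $1.
cpu_within_limit() {
    awk -v limit="$1" '
        { sum += $1 }
        END {
            printf "%.1f\n", sum
            exit (sum > limit)   # exit status 1 when over the limit
        }'
}

# Example (run on the node hosting the job; "someuser" is a placeholder):
#   ps -u someuser -o pcpu= | cpu_within_limit 200
```

A job that is pinned to two cores should stay at or below the limit; a total far above it suggests the restriction is not being applied.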
________________________________________
From: Cutts, Tim via slurm-users <slurm-users@lists.schedmd.com>
Sent: Wednesday, 26 March 2025 07:32
To: Gestió Servidors; slurm-users@lists.schedmd.com
Subject: [slurm-users] Re: Using more cores/CPUs that requested with
[...]
If you are letting systemd take over most things, you have systemd-cgtop, which works better than top for your case. There is also systemd-cgls for non-interactive listing.
Also, check whether you are using cgroup2. Running `mount` and looking at your cgroup mounts would suffice. The original cgroup (v1) is likely not supposed to be used in newer deployments of Slurm.
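One way to do the mount check is sketched below. It assumes the usual Linux `mount` output format, "<source> on <target> type <fstype> (<options>)", and cgroup_versions is a hypothetical helper name.

```shell
# Hypothetical helper: read `mount` output on stdin and report which
# cgroup filesystem types are mounted.
cgroup_versions() {
    awk '
        # Field 5 is the filesystem type when field 4 is the word "type".
        $4 == "type" && ($5 == "cgroup" || $5 == "cgroup2") { seen[$5] = 1 }
        END {
            if (seen["cgroup2"]) print "cgroup2"
            if (seen["cgroup"])  print "cgroup (v1)"
        }'
}

# Example:
#   mount | cgroup_versions
```

If both lines are printed, the node is running a hybrid setup with v1 and v2 hierarchies mounted side by side.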
On Wednesday, 26 March 2025 at 17:14, Gestió Servidors via slurm-users <slurm-users@lists.schedmd.com> wrote:
[...]
-- slurm-users mailing list -- slurm-users@lists.schedmd.com To unsubscribe send an email to slurm-users-leave@lists.schedmd.com