[slurm-users] unable to run on all the logical cores

William Brown william at signalbox.org.uk
Mon Oct 12 03:55:13 UTC 2020


Using the R "parallel" package or similar will be the easiest and most efficient option, but it does require that you have control of the R code.  There are different approaches for work within a single node versus across many nodes; the latter has a set-up and tear-down cost, so the workload inside the loop must be substantial enough to justify it.
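
For example, within a single node something like this sketch (the function body is just a placeholder) keeps every core busy, while crossing nodes means building a cluster object over the network, which is where the set-up and tear-down cost comes in (the hostnames below are illustrative):

  library(parallel)

  # Single node: fork one worker per core on the local machine (Unix only);
  # start-up is essentially free.
  res <- mclapply(seq_len(100), function(i) sum(rnorm(1e6)),
                  mc.cores = detectCores())

  # Many nodes: PSOCK workers launched over the network; noticeable
  # set-up and tear-down cost, so the per-task work must be worthwhile.
  cl <- makePSOCKcluster(c("node01", "node02"))
  res <- parLapply(cl, seq_len(100), function(i) sum(rnorm(1e6)))
  stopCluster(cl)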

 

I think that if you configure Slurm to misrepresent the hardware, you might find odd things happening elsewhere.

 

We always disable hyper-threading on the compute nodes, at the recommendation of the company that installed the cluster, which reported that it ran faster that way.  Hyper-threading has a cost and is not ideal for compute-bound workloads; it suits things like web servers, where some tasks are waiting on I/O or user input.  That said, it likely depends on the CPU: what is true for Intel might not be true for AMD.

From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of David Bellot
Sent: 12 October 2020 01:14
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] unable to run on all the logical cores

 

Indeed, it makes sense now. However, if I launch many R processes using the "parallel" package, I can easily have all the "logical" cores running. In the background, if I'm correct, R will "fork" rather than create a thread, so we get independent processes. On a 20-core CPU, for example, I have 40 "logical" cores, and all of them are running according to htop.
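
To illustrate what I mean (the workload is a placeholder; 40 matches the logical-core count of this box):

  library(parallel)
  detectCores()                  # 40: counts logical cores by default
  detectCores(logical = FALSE)   # 20: physical cores only
  # Each worker is a fork()ed copy of the R process, not a thread, so
  # htop shows independent R processes on every logical core.
  res <- mclapply(seq_len(1000), function(i) sum(rnorm(1e6)), mc.cores = 40)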

 

With Slurm, I can't reproduce the same behavior even if I use the SelectTypeParameters=CR_CPU.
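
For reference, a minimal slurm.conf sketch of the kind of setup I mean (node name and topology are illustrative, not my real config); ThreadsPerCore=2 is what should let Slurm schedule the logical cores as CPUs under CR_CPU:

  SelectType=select/cons_res
  SelectTypeParameters=CR_CPU
  # Declare the real topology: 1 socket x 20 cores x 2 threads = 40 CPUs
  NodeName=node01 CPUs=40 Sockets=1 CoresPerSocket=20 ThreadsPerCore=2 State=UNKNOWN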

 

So, is there a config to tune or an option to use with "sbatch" to achieve the same result, or should I rather launch 20 jobs per node and have each job split itself in two internally (using "parallel" or "future", for example)?
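
For what it's worth, the kind of submission script I have been trying looks roughly like this (a sketch; --hint=multithread is my attempt to ask for both hardware threads of each core, and the R one-liner is a placeholder workload):

  #!/bin/bash
  #SBATCH --nodes=1
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=40
  #SBATCH --hint=multithread
  # Hand the allocated CPU count to R instead of hard-coding it.
  Rscript -e 'parallel::mclapply(seq_len(1000),
                                 function(i) sum(rnorm(1e6)),
                                 mc.cores = as.integer(Sys.getenv("SLURM_CPUS_PER_TASK")))'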

 

On Thu, Oct 8, 2020 at 6:32 PM William Brown <william at signalbox.org.uk> wrote:

R is single-threaded.

 

On Thu, 8 Oct 2020, 07:44 Diego Zuccato <diego.zuccato at unibo.it> wrote:

On 08/10/20 08:19, David Bellot wrote:

> good spot. At least, scontrol show job now says that each job only
> requires one "CPU", so it seems all the cores are treated the same way.
> Still, I have the problem of not using more than half the cores, so I
> suppose it might be due to the way I submit the jobs (batchtools in
> this case).
Maybe R is generating single-threaded code? In that case, only a single
process can run on a given core at a time (processes do not share a
memory map, threads do, and on Intel CPUs there is a single MMU per
core, not one per thread as on some AMD CPUs).

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786




 

-- 
David Bellot
Head of Quantitative Research
<https://www.lifetrading.com.au/>
A. Suite B, Level 3A, 43-45 East Esplanade, Manly, NSW 2095
E. david.bellot at lifetrading.com.au
P. (+61) 0405 263012
