<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body dir="auto">
Slurm supports an l3cache_as_socket [1] parameter in recent releases. That would make an AMD EPYC system, for example, appear to have many more sockets than it physically has, and that should help ensure the threads of a single task share an L3 cache.
<div><br>
</div>
<div>You’d want to run slurmd -C on a node with that setting enabled to generate the new NodeName parameters, and replace the old entries in the overall slurm.conf with the updated values.<br>
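<div><br>
</div>
<div>A rough sketch of that workflow (the node name and counts below are made up for illustration; your slurmd -C output will differ):<br>

```
# In slurm.conf, enable the option for slurmd:
SlurmdParameters=l3cache_as_socket

# On a node with the setting active, print the detected layout:
#   slurmd -C
# Hypothetical output for an EPYC node, with each L3 complex
# now reported as its own "socket":
#   NodeName=node01 Sockets=16 CoresPerSocket=8 ThreadsPerCore=2 ...

# Paste the printed values over the old NodeName entry in slurm.conf,
# then restart the slurm daemons so the new topology takes effect.
```
</div>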
<div><br>
</div>
<div>[1] <a href="https://slurm.schedmd.com/slurm.conf.html#OPT_l3cache_as_socket">https://slurm.schedmd.com/slurm.conf.html#OPT_l3cache_as_socket</a><br>
<div dir="ltr"><br>
<blockquote type="cite">On Mar 13, 2022, at 1:43 PM, vicentesmith <vicentesmith@protonmail.com> wrote:<br>
<br>
</blockquote>
</div>
<blockquote type="cite">
<div dir="ltr">
<div><span>Hello,</span>
<div><span>I'm performing some tests (CPU-only systems) to compare a pure MPI setup against a hybrid MPI+OpenMP setup. The system is running Open MPI v4.1.2, so a job launch reads:</span></div>
<div><span> mpirun -np 48 foo.exe</span></div>
<div><span>or </span></div>
<div><span> export OMP_NUM_THREADS=8</span></div>
<div><span> mpirun -np 6 foo.exe</span></div>
<div><span>In our system, the latter runs slightly faster (about 5 to 10%), though any performance gain or loss will depend on the system and the application.
</span></div>
<div><span>In the same system and for the same app, the first SLURM script reads:</span></div>
<div><span> #!/bin/bash</span></div>
<div><span> #SBATCH --job-name=***</span></div>
<div><span> #SBATCH --output=* </span></div>
<div><span> #SBATCH --ntasks=48</span></div>
<div><span> mpirun foo.exe </span></div>
<div><span>This script runs fine. Then, for the hybrid job, the script reads:
</span></div>
<div><span> #!/bin/bash</span></div>
<div><span> #SBATCH --job-name=***hybrid</span></div>
<div><span> #SBATCH --output=*** </span></div>
<div><span> #SBATCH --ntasks=6 </span></div>
<div><span> #SBATCH --cpus-per-task=8 </span></div>
<div><span> export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK</span></div>
<div><span> mpirun foo.exe </span></div>
<div><span>However, this runs much slower, and it seems to slow down even further as it progresses. Something is obviously not clicking in the latter case. My only explanation is that the threads are not forked out correctly (by this I mean that the
8 threads of each task are not pinned to cores sharing the same L3). Open MPI is supposed to choose the path of least resistance, but I was wondering whether I might need to recompile Open MPI with some extra flags or modify the SLURM script somehow. </span></div>
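<div><span>One thing worth trying before recompiling anything: make the binding explicit rather than relying on defaults. A sketch of the hybrid script with explicit placement (the mapping options below assume Open MPI 4.x syntax and one 8-core L3 complex per rank; adjust to your actual topology):</span>

```
#!/bin/bash
#SBATCH --job-name=hybrid
#SBATCH --ntasks=6
#SBATCH --cpus-per-task=8

export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
# Keep each rank's OpenMP threads on the cores they start on
export OMP_PLACES=cores
export OMP_PROC_BIND=close

# Map one rank per L3 cache, give each rank 8 processing elements,
# and bind to cores so a rank's threads stay within one L3 domain.
mpirun --map-by ppr:1:l3cache:pe=8 --bind-to core foo.exe
```

<span>If the ranks end up spread across L3 domains by default, pinning them this way is usually where the 8-thread case recovers its performance.</span></div>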
<div><span>Thanks.</span></div>
<span></span><br>
</div>
</div>
</blockquote>
</div>
</div>
</body>
</html>