[slurm-users] cpus-per-task behaviour of srun after 22.05

Ryan Novosielski novosirj at rutgers.edu
Sun Oct 22 17:19:34 UTC 2023


What we tell users at our site is to use srun. If you don’t, you will see limited, if any, resource-usage output in the usual places (sacct, etc.), and I learned recently that sattach won’t work either. I also find it’s easier to make mistakes with resource use without it.

We also recommend using it to launch MPI jobs instead of mpirun/mpiexec/etc. That is our supported mode of operation and the way all of the centrally built MPI stacks work.


On Oct 22, 2023, at 12:52, Jason Simms <jsimms1 at swarthmore.edu> wrote:


Hello Michael,

I don't have an elegant solution, but I'm writing mostly to +1 this. I didn't catch this in the release notes but am concerned if it is indeed the new behavior. Researchers use scripts that rely on --cpus-per-task (or -c) as part of, e.g., SBATCH directives. I suppose you could simply include something like this, unless someone knows why it wouldn't work, but even if so it seems inelegant:

export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
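
The same idea can be written as a guard at the top of a batch script. This is only a sketch, assuming srun reads SRUN_CPUS_PER_TASK as the 22.05 changelog describes. The function name propagate_cpus_per_task and the value 4 are made up for illustration, and the srun call is left as a comment so the snippet can be tried outside a job:

```shell
#!/bin/bash
#SBATCH --cpus-per-task=4

# Copy the value sbatch exported into the variable srun reads, but only
# when sbatch actually set it and SRUN_CPUS_PER_TASK is not already set.
# (Function name is hypothetical, not part of Slurm.)
propagate_cpus_per_task() {
    if [ -n "${SLURM_CPUS_PER_TASK:-}" ] && [ -z "${SRUN_CPUS_PER_TASK:-}" ]; then
        export SRUN_CPUS_PER_TASK="$SLURM_CPUS_PER_TASK"
    fi
}

propagate_cpus_per_task

# In a real job script the step launch would follow, e.g.:
#   srun myscript
```

The guard leaves an SRUN_CPUS_PER_TASK that a user set deliberately untouched, and does nothing when the script runs without --cpus-per-task.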

A related question I have, which has come up a couple of times in various other contexts, is truly understanding the difference, in a submit script, between including srun and not, for example:

srun myscript
myscript

People have asked whether srun is required, or what the difference is if it is not included, and honestly it seems like the common reply is that "it doesn't matter that much." But nobody that I've seen (and I've not done an exhaustive search) has articulated whether it actually matters to use srun within a batch script. If this is now the behavior, it appears that simply not using srun will still permit the task to use --cpus-per-task.

Warmest regards,
Jason

On Fri, Oct 20, 2023 at 5:00 AM Michael Müller <Michael.Mueller12 at tu-dresden.de> wrote:
Hello,

I haven't really seen this discussed anywhere, but maybe I didn't look
in the right places.

After our upgrade from 21.08 to 23.02 we had users complaining about
srun not using the specified --cpus-per-task given in sbatch-directives.
The changelog of 22.05 mentions this change and explains the need to set
the Environment variable SRUN_CPUS_PER_TASK. The environment variable
SLURM_CPUS_PER_TASK will be set by the sbatch-directive, but is ignored
by srun.
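
For illustration, srun still honours an explicit --cpus-per-task, so the value from the sbatch directive can be passed on each call instead of through the environment. The following is a dry-run sketch that only prints the command. The helper srun_cmdline and the program ./my_app are placeholders, and the fallback of 1 is an assumption for running outside a job:

```shell
#!/bin/bash
#SBATCH --cpus-per-task=4

# Print the srun command line with --cpus-per-task passed explicitly.
# Falls back to 1 if SLURM_CPUS_PER_TASK is unset (e.g. outside a job).
# (Helper name and ./my_app are hypothetical.)
srun_cmdline() {
    printf 'srun --cpus-per-task=%s %s\n' "${SLURM_CPUS_PER_TASK:-1}" "$*"
}

# Dry run: show what would be launched. In a job script, run the command
# itself instead of printing it.
srun_cmdline ./my_app
```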

Does anyone know why this behaviour was changed? Imo the expectation
that an sbatch-directive is the default for the whole job-script is
reasonable.

Is there a config option to reenable the old behaviour, or do we have to
find a workaround with a job_submit script or a profile.d script? If so,
have any of you already implemented such a workaround?


With kind regards
Michael

--
Michael Müller
Application Developer

Dresden University of Technology
Center of Information Services and High Performance Computing (ZIH)
Department of Interdisciplinary Application Development and Coordination (IAK)
01062 Dresden

phone: (0351)463-35261
www: http://www.tu-dresden.de/zih



--
Jason L. Simms, Ph.D., M.P.H.
Manager of Research Computing
Swarthmore College
Information Technology Services
(610) 328-8102
Schedule a meeting: https://calendly.com/jlsimms