[slurm-users] ntasks and cpus-per-task
Loris Bennett
loris.bennett at fu-berlin.de
Fri Feb 23 03:50:14 MST 2018
Hi Chris,
Christopher Benjamin Coffey <Chris.Coffey at nau.edu> writes:
> Hi Loris,
>
>> But that's only the case if the program is started with srun or some
>> form of mpirun. Otherwise the program just gets started once on one
>> core and the other cores just idle.
>
> Yes, maybe that’s true about what you say when not using srun. I'm not
> sure, as we tell everyone to use srun to launch every type of task.
OK, I'm confused now. Our main culprit for producing processes with
incorrect affinity is ORCA [1]. It uses OpenMPI but also likes to start
processes asynchronously via SSH within the node set. Our users run
their jobs via batch files containing, say,

  #SBATCH --ntasks=8
  ...
  $ORCA_PATH/orca ...

However, if I run an ORCA job with 'srun', i.e.

  #SBATCH --ntasks=8
  ...
  srun $ORCA_PATH/orca ...

this results in the program being run 8 times, once per task, with all
8 instances writing to the same log and output files.
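
To make the effect concrete, here is a minimal sketch that needs no
Slurm installation. It is only an imitation: 'launch_per_task' is a
hypothetical stand-in for srun's "one copy per task" behaviour, and
'fake_orca' is a hypothetical placeholder for a program that handles its
own parallel start-up. The only real Slurm name used is SLURM_PROCID,
the per-task rank variable that srun sets.

```shell
#!/bin/sh
# Hypothetical stand-in for srun: start the given command once per task.
# Real srun places the copies on the allocated cores/nodes; this loop only
# imitates the "one copy per task" behaviour on a single machine.
launch_per_task() {
    ntasks=$1
    shift
    i=0
    while [ "$i" -lt "$ntasks" ]; do
        # Each copy runs in the background with its own task rank.
        SLURM_PROCID=$i "$@" &
        i=$((i + 1))
    done
    wait
}

# Hypothetical placeholder for the real program: it appends one line to
# the log file it is given, so every started instance leaves a trace.
fake_orca() {
    echo "instance started (task ${SLURM_PROCID:-none})" >> "$1"
}

log=$(mktemp)

# Launched directly, as in the first batch file: one instance.
fake_orca "$log"
direct_count=$(grep -c . "$log")

# Launched through the stand-in launcher with 8 tasks, as in the second
# batch file: eight instances, all appending to the same file.
: > "$log"
launch_per_task 8 fake_orca "$log"
srun_count=$(grep -c . "$log")

echo "direct: $direct_count, per-task launch: $srun_count"
rm -f "$log"
```

The direct call leaves one line in the log and the per-task launch
leaves eight, which mirrors the duplicated output I see when the real
program is wrapped in 'srun' with --ntasks=8.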
Is ORCA just a pathological exception to the idea that it's always good
to use 'srun'? (As it causes well over 95% of our affinity problems, it
is already pathological in that sense.)
Cheers,
Loris
Footnotes:
[1] https://orcaforum.cec.mpg.de/
--
Dr. Loris Bennett (Mr.)
ZEDAT, Freie Universität Berlin Email loris.bennett at fu-berlin.de