[slurm-users] Reserving slots with sbatch and OpenMpi
Mccall, Kurt E. (MSFC-EV41)
kurt.e.mccall at nasa.gov
Mon Mar 14 11:17:58 UTC 2022
Yes, sorry. It is
mpirun -wdir "." ./parent
I expected mpirun to pick up the job parameters from the SLURM_* environment variables created by sbatch.
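To be explicit about that expectation: there is no -np, -host, or hostfile on the mpirun line, so the layout should come entirely from the Slurm allocation. A quick sanity check I can add just before the mpirun line looks like this (the variables listed are the ones I believe Open MPI's Slurm support reads, so treat that list as my assumption):

    # Print what the allocation actually hands to mpirun.
    echo "SLURM_JOB_NODELIST   = $SLURM_JOB_NODELIST"
    echo "SLURM_JOB_NUM_NODES  = $SLURM_JOB_NUM_NODES"
    echo "SLURM_NTASKS         = $SLURM_NTASKS"
    echo "SLURM_TASKS_PER_NODE = $SLURM_TASKS_PER_NODE"
    echo "SLURM_CPUS_PER_TASK  = $SLURM_CPUS_PER_TASK"

    # Unchanged launch: no explicit host or slot arguments.
    mpirun -wdir "." ./parent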
Thanks,
Kurt
From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Ralph Castain
Sent: Friday, March 11, 2022 3:48 PM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [EXTERNAL] Re: [slurm-users] Reserving slots with sbatch and OpenMpi
I assume you are running the job via mpirun? Can you share the mpirun cmd line?
On Mar 11, 2022, at 11:31 AM, Mccall, Kurt E. (MSFC-EV41) <kurt.e.mccall at nasa.gov> wrote:
With sbatch, what is the proper way to launch 5 tasks, each on its own node, while reserving two slots per node so that each original task can create one new process with MPI_Comm_spawn?
I’ve tried various combinations of the sbatch arguments --nodes, --ntasks-per-node and --cpus-per-task, but every attempt results in this Open MPI error message:
“All nodes which are allocated for this job are already filled.”
I expected the proper arguments to be --nodes=5 --ntasks=5 --cpus-per-task=2.
The 5 original processes are created correctly, but MPI_Comm_spawn appears to trigger the error message when it tries to allocate a CPU for the child.
I’m using Slurm 20.11.8 and Open MPI 4.1.2.
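In case it helps, the submission boils down to the sketch below (trimmed to the essentials; the commented-out variant that allocates two tasks per node so each node keeps a slot free for its spawned child is only a guess on my part, not something I’ve verified):

    #!/bin/bash
    #SBATCH --nodes=5
    #SBATCH --ntasks=5
    #SBATCH --cpus-per-task=2

    # Guessed alternative: ask Slurm for two tasks per node so each node keeps
    # one slot open for the MPI_Comm_spawn child, then start only one parent
    # per node:
    #   #SBATCH --nodes=5
    #   #SBATCH --ntasks-per-node=2
    #   mpirun -np 5 --map-by ppr:1:node -wdir "." ./parent

    mpirun -wdir "." ./parent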
Thanks,
Kurt