[slurm-users] Slurm and MPICH don't play well together (salloc)

Wed Dec 29 18:55:22 UTC 2021

Antony,

I’m not sure I understand your answer. I want to launch 2 tasks (managers), one per node, but reserve the rest of the cores on each node so that the original 2 managers can spawn new workers on them. Requesting 24 tasks would create 24 managers, I think.

Kurt

From: slurm-users <slurm-users-bounces at lists.schedmd.com> On Behalf Of Antony Cleave
Sent: Tuesday, December 28, 2021 6:15 PM
To: Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: [EXTERNAL] Re: [slurm-users] Slurm and MPICH don't play well together (salloc)

Hi

I've not used MPICH for years, but I think I see the problem. By asking for 24 CPUs per task and specifying 2 tasks, you are asking Slurm to allocate 48 CPUs per node.

Your nodes have 24 CPUs in total, so you don't have any nodes that can service this request.

Try asking for 24 tasks. I've only ever used CPUs per task for hybrid MPI/OpenMP codes, with 2 MPI tasks and 12 threads per task.
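
The arithmetic behind this reading of the request can be sketched as follows. This is only an illustration: the variable names are made up for the example, the per-node CPU count is taken from the CPUTot=24 value in the scontrol output quoted further down, and whether Slurm would really try to place both tasks on one node depends on the site configuration.

```shell
#!/bin/sh
# Values from the salloc request in this thread.
ntasks=2
cpus_per_task=24
# CPUTot reported by "scontrol show node" (assumed identical on every node).
cpus_per_node=24

# Total CPUs the request asks for across all tasks.
total_cpus=$((ntasks * cpus_per_task))

# Smallest number of nodes that can hold that many CPUs (ceiling division).
min_nodes=$(( (total_cpus + cpus_per_node - 1) / cpus_per_node ))

echo "CPUs requested in total: $total_cpus"
echo "CPUs available on one node: $cpus_per_node"
echo "Minimum nodes needed: $min_nodes"
```

With these numbers the request adds up to 48 CPUs, which cannot fit on a single 24-CPU node and needs at least 2 nodes with one task on each.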

Antony

On Tue, 28 Dec 2021, 23:02 Mccall, Kurt E. (MSFC-EV41), <kurt.e.mccall at nasa.gov> wrote:
Hi,

My MPICH jobs are being launched and the desired number of processes is created, but when one of those processes tries to spawn a new process using MPI_Comm_spawn(), that process just spins in the polling code deep within the MPICH library. See the Slurm error message below. This all works without problems on other clusters that have Torque as the process manager. We are using Slurm 20.02.3 on Red Hat (kernel 4.18.0), and MPICH 4.0b1.

salloc: defined options
salloc: -------------------- --------------------
salloc: cpus-per-task       : 24
salloc: ntasks              : 2
salloc: verbose             : 1
salloc: -------------------- --------------------
salloc: end of defined options
salloc: Linear node selection plugin loaded with argument 4
salloc: select/cons_res loaded with argument 4
salloc: Cray/Aries node selection plugin loaded
salloc: select/cons_tres loaded with argument 4
salloc: Granted job allocation 34330
srun: error: Unable to create step for job 34330: Requested node configuration is not available

I’m wondering if the salloc command I am using is correct. I intend for it to launch 2 processes, one per node, but reserve 24 cores on each node so that the 2 launched processes can spawn new processes using MPI_Comm_spawn. Could the reservation of all 24 cores make Slurm or MPICH think that there are no more cores available?

salloc --ntasks=2 --cpus-per-task=24 --verbose runscript.bash …
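
For illustration, the intended layout (one task per node, all 24 cores of each node reserved) could be stated explicitly with salloc's --nodes and --ntasks-per-node options. The sketch below is a dry run that only builds and prints such a command line; it does not call salloc. The variable names are hypothetical, the script's own arguments are left out, and it is not confirmed that this form resolves the step-creation error above.

```shell
#!/bin/sh
# Dry run: assemble a salloc command line and print it; nothing is executed.
nodes=2            # one manager task on each of two nodes
cpus_per_node=24   # CPUTot from "scontrol show node" (assumed for all nodes)

# --ntasks-per-node=1 pins one task to each node instead of leaving placement to Slurm.
cmd="salloc --nodes=$nodes --ntasks-per-node=1 --cpus-per-task=$cpus_per_node --verbose runscript.bash"

echo "$cmd"
```

Printing the command first makes it easy to compare against the "defined options" block that salloc --verbose reports before anything is submitted.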

I think that our cluster’s compute nodes are configured correctly:

$ scontrol show node=n001

NodeName=n001 Arch=x86_64 CoresPerSocket=6
   CPUAlloc=0 CPUTot=24 CPULoad=0.00
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=n001 NodeHostName=n001 Version=20.02.3
   OS=Linux 4.18.0-348.el8.x86_64 #1 SMP Mon Oct 4 12:17:22 EDT 2021
   RealMemory=128351 AllocMem=0 FreeMem=126160 Sockets=4 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=normal,low,high
   BootTime=2021-12-21T14:25:05 SlurmdStartTime=2021-12-21T14:25:52
   CfgTRES=cpu=24,mem=128351M,billing=24
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 AveWatts=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

Thanks for any help.