[slurm-users] Segfault with 32 processes, OK with 30 ???

Riebs, Andy andy.riebs at hpe.com
Tue Oct 6 11:45:26 UTC 2020


> The problem is with a single, specific, node: str957-bl0-03 . The same
> job script works if being allocated to another node, even with more
> ranks (tested up to 224/4 on mtx-* nodes).

Ahhh... here's where the details help. So it appears that the problem is on a single node, and probably not a general configuration or system problem. I suggest starting with something like this to help figure out why node bl0-03 is different:

$ sudo ssh str957-bl0-02 lscpu
$ sudo ssh str957-bl0-03 lscpu

Andy

-----Original Message-----
From: Diego Zuccato [mailto:diego.zuccato at unibo.it] 
Sent: Tuesday, October 6, 2020 3:13 AM
To: Riebs, Andy <andy.riebs at hpe.com>; Slurm User Community List <slurm-users at lists.schedmd.com>
Subject: Re: [slurm-users] Segfault with 32 processes, OK with 30 ???

Il 05/10/20 14:18, Riebs, Andy ha scritto:

Thanks for considering my query.

> You need to provide some hints! What we know so far:
> 1. What we see here is a backtrace from (what looks like) an Open MPI/PMI-x backtrace.
Correct.

> 2. Your decision to address this to the Slurm mailing list suggests that you think that Slurm might be involved.
At least I couldn't replicate it when launching manually (it always says
"no slots available" unless I use mpirun -np 16 ...). I'm no MPI expert
(actually less than a noob!), so I can't rule out that it's unrelated to
Slurm. I mostly hope that on this list I can find someone with enough
experience with both Slurm and MPI.
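(For reference, Open MPI's "no slots available" message outside Slurm usually just means mpirun is capped at the slot count it detects on the host; a hedged sketch of forcing 32 ranks by hand, using the `--oversubscribe` flag available in Open MPI 3.x, would be:

```shell
# Assumption: ./hello is the test program from later in this mail.
# --oversubscribe lifts Open MPI's default one-rank-per-slot limit,
# so 32 ranks can start even where fewer slots are detected.
mpirun --oversubscribe -np 32 ./hello
```

This only changes how the manual reproduction is launched; it says nothing about what Slurm does differently on that node.)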

> 3. You have something (a job? a program?) that segfaults when you go from 30 to 32 processes.
Multiple programs, actually.

> a. What operating system?
Debian 10.5. The only extension is PBIS-Open to authenticate users from AD.

> b. Are you seeing this while running Slurm? What version?
18.04, Debian packages

> c. What version of Open MPI?
openmpi-bin/stable,now 3.1.3-11 amd64

> d. Are you building your own PMI-x, or are you using what's provided by Open MPI and Slurm?
Using Debian packages

> e. What does your hardware configuration look like -- particularly, what cpu type(s), and how many cores/node?
The node uses dual Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz for a total
of 32 threads (hyperthreading is enabled: 2 sockets, 8 cores per socket,
2 threads per core).
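(Just restating that topology as arithmetic, the logical CPU count Slurm should see on the node works out as:

```shell
# 2 sockets x 8 cores/socket x 2 threads/core, per the description above.
sockets=2; cores=8; threads=2
echo "logical CPUs: $((sockets * cores * threads))"
```

so 32 ranks is exactly one rank per hardware thread, which is why 30 vs. 32 may be the interesting boundary.)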

> f. What does you Slurm configuration look like (assuming you're seeing this with Slurm)? I suggest purging your configuration files of node names and IP addresses, and including them with your query.
Here it is:
-8<--
SlurmCtldHost=str957-cluster(*.*.*.*)
AuthType=auth/munge
CacheGroups=0
CryptoType=crypto/munge
#DisableRootJobs=NO
EnforcePartLimits=YES
JobSubmitPlugins=lua
MpiDefault=none
MpiParams=ports=12000-12999
ReturnToService=2
SlurmctldPidFile=/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm/slurmd
SlurmUser=slurm
StateSaveLocation=/var/lib/slurm/slurmctld
SwitchType=switch/none
TaskPlugin=task/cgroup
TmpFS=/mnt/local_data/
UsePAM=1
GetEnvTimeout=20
InactiveLimit=0
KillWait=120
MinJobAge=300
SlurmctldTimeout=20
SlurmdTimeout=30
FastSchedule=0
SchedulerType=sched/backfill
SchedulerPort=7321
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
PriorityFlags=MAX_TRES
PriorityType=priority/multifactor
PreemptMode=CANCEL
PreemptType=preempt/partition_prio
AccountingStorageEnforce=safe,qos
AccountingStorageHost=str957-cluster
#AccountingStorageLoc=
#AccountingStoragePass=
#AccountingStoragePort=6819
#AccountingStorageTRES=
AccountingStorageType=accounting_storage/slurmdbd
#AccountingStorageUser=
AccountingStoreJobComment=YES
AcctGatherNodeFreq=300
ClusterName=oph
JobCompLoc=/var/spool/slurm/jobscompleted.txt
JobCompType=jobcomp/filetxt
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm/slurmd.log
NodeName=DEFAULT Sockets=2 ThreadsPerCore=2 State=UNKNOWN
NodeName=str957-bl0-0[1-2] CoresPerSocket=6 Feature=ib,blade,intel
NodeName=str957-bl0-0[3-5] CoresPerSocket=8 Feature=ib,blade,intel
NodeName=str957-bl0-[15-16] CoresPerSocket=4 Feature=ib,nonblade,intel
NodeName=str957-bl0-[17-18] CoresPerSocket=6 ThreadsPerCore=1 Feature=nonblade,amd
NodeName=str957-bl0-[19-20] Sockets=4 CoresPerSocket=8 ThreadsPerCore=1 Feature=nonblade,amd
NodeName=str957-mtx-[00-15] CoresPerSocket=14 Feature=ib,nonblade,intel
-8<--
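(A side note on that config: with FastSchedule=0, the hardware that slurmd autodetects is authoritative, so one quick sanity check on the suspect node is to compare detection against configuration; a sketch, to be run on str957-bl0-03 itself:

```shell
# Print the CPUs/Sockets/Cores/Threads slurmd autodetects on this node:
slurmd -C
# Compare with what the controller currently believes about the node:
scontrol show node str957-bl0-03 | grep -iE 'cpu|core|thread'
```

If a BIOS change had, say, disabled hyperthreading on that one node, the two would disagree and only 16 CPUs would be usable there.)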

> g. What does your command line look like? Especially, are you trying to run 32 processes on a single node? Spreading them out across 2 or more nodes?
The problem is with a single, specific, node: str957-bl0-03 . The same
job script works if being allocated to another node, even with more
ranks (tested up to 224/4 on mtx-* nodes).

> h. Can you reproduce the problem if you substitute `hostname` or `true` for the program in the command line? What about a simple MPI-enabled "hello world"?
I'll try ASAP w/ a simple 'hostname', but I expect it to work.
The original problem is with a complex program run by a user. To try to
debug the issue I'm using what I think is the simplest MPI program possible:
-8<--
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#define MASTER 0

int main (int argc, char *argv[])
{
  int numtasks, taskid, len;
  char hostname[MPI_MAX_PROCESSOR_NAME];
  MPI_Init(&argc, &argv);
//  int provided=0;
//  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
//  printf("MPI provided threads: %d\n", provided);
  MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
  MPI_Comm_rank(MPI_COMM_WORLD, &taskid);

  if (taskid == MASTER)
    printf("This is an MPI parallel code for Hello World with no communication\n");
  //MPI_Barrier(MPI_COMM_WORLD);

  MPI_Get_processor_name(hostname, &len);

  printf("Hello from task %d on %s!\n", taskid, hostname);

  if (taskid == MASTER)
    printf("MASTER: Number of MPI tasks is: %d\n", numtasks);

  MPI_Finalize();

  printf("END OF CODE from task %d\n", taskid);
  return 0;
}
-8<--
And I got failures with it, too.
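(For anyone wanting to reproduce: a sketch of how the program above would typically be built and pinned to the suspect node; the source filename and node selection are assumptions, and the actual job script may differ:

```shell
# Build with the Open MPI compiler wrapper, then force the job onto
# the failing node with exactly 32 tasks:
mpicc -o hello hello.c
srun -N1 -n32 -w str957-bl0-03 ./hello
```

Dropping -n to 30 should then show the working case on the same node.)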

-- 
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786


