[slurm-users] Issue with my hello mpi toy program
Mahmood Naderan
mahmood.nt at gmail.com
Thu Oct 17 07:01:14 UTC 2019
Hi,
I used to run a hello MPI program for testing purposes. Now I see that it no
longer works: the log file shows a memory allocation problem (the shared-memory
BTL fails to mmap 134217728 bytes, i.e. 128 MiB), yet squeue shows the job
staying in the R state endlessly.
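For reference, mpihello is just the usual MPI hello world built with the
cluster's mpicc; the sketch below is only illustrative of what it does (the
actual source may differ slightly):

/* mpihello.c -- minimal MPI hello world (illustrative sketch) */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);                /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();                        /* shut down the MPI runtime */
    return 0;
}

It is compiled with something like "mpicc -o mpihello mpihello.c".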
[mahmood@hpc ~]$ cat slurm_script1.sh
#!/bin/bash
#SBATCH --job-name=hello_mpi
#SBATCH --output=hellompi.log
#SBATCH --ntasks=4
#SBATCH --time=10:00
#SBATCH --partition=SEA
#SBATCH --account=fish
#SBATCH --mem=100M
mpirun ./mpihello
[mahmood@hpc ~]$ sbatch slurm_script1.sh
Submitted batch job 10
[mahmood@hpc ~]$ cat hellompi.log
[hpc.safaar.com:18059] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
--------------------------------------------------------------------------
A system call failed during shared memory initialization that should
not have. It is likely that your MPI job will now either abort or
experience performance degradation.
Local host: hpc.safaar.com
System call: mmap(2)
Error: Cannot allocate memory (errno 12)
--------------------------------------------------------------------------
[mahmood@hpc ~]$ squeue
  JOBID PARTITION     NAME     USER ST   TIME  NODES NODELIST(REASON)
     10       SEA hello_mp  mahmood  R   0:36      1 hpc
[mahmood@hpc ~]$ squeue
  JOBID PARTITION     NAME     USER ST   TIME  NODES NODELIST(REASON)
     10       SEA hello_mp  mahmood  R   0:43      1 hpc
[mahmood@hpc ~]$ sacctmgr list association format=partition,account,user,grptres
Partition Account User GrpTRES
---------- ---------- ---------- -------------
root
root root
fish
sea fish mahmood cpu=10,mem=8G
local mahmood
However, the binary itself works fine when run outside of Slurm.
Regards,
Mahmood