[slurm-users] Multinode MPI job

Marcus Wagner wagner at itc.rwth-aachen.de
Thu Mar 28 07:21:32 UTC 2019


Hi Mahmood,

On 3/28/19 7:33 AM, Mahmood Naderan wrote:
> >srun --pack-group=0 --ntasks=2 : --pack-group=1 --ntasks=4 pw.x -i mos2.rlx.in
>
> Still only one node is running the processes
No, the processes are running exactly as requested; each pack group is running on its own node.
>
> $ squeue
>              JOBID PARTITION     NAME     USER ST TIME  NODES NODELIST(REASON)
>              755+1    QUARTZ     myQE   ghatee  R 0:47      1 rocks7
>              755+0    QUARTZ     myQE   ghatee  R 0:47      1 compute-0-2

compute-0-2 is the first pack group (755+0); it should run 2 tasks.

> $ rocks run host compute-0-2  "ps aux | grep pw.x"
> ghatee     541  0.1  0.0 582048  7604 ?        Sl 02:29   0:00 srun --pack-group=0 --ntasks=2 : --pack-group=1 --ntasks=4 pw.x -i mos2.rlx.in
> ghatee     542  0.0  0.0  46452   748 ?        S 02:29   0:00 srun --pack-group=0 --ntasks=2 : --pack-group=1 --ntasks=4 pw.x -i mos2.rlx.in
> ghatee     559 99.6  0.1 1930560 129728 ?      Rl 02:29   0:52 /home/ghatee/QuantumEspresso621/bin/pw.x -i mos2.rlx.in
> ghatee     560 99.7  0.1 1930560 129720 ?      Rl 02:29   0:52 /home/ghatee/QuantumEspresso621/bin/pw.x -i mos2.rlx.in
> ghatee     590  0.0  0.0 113132  1588 ?        Ss 02:30   0:00 bash -c ps aux | grep pw.x
> ghatee     629  0.0  0.0 112668   960 ?        S 02:30   0:00 grep pw.x

Process IDs 559 and 560 are the two pw.x tasks.


rocks7 is the second pack group (755+1); it should run 4 tasks.

> $ rocks run host rocks7  "ps aux | grep pw.x"
> ghatee   16219 99.0  0.1 1930484 127764 ?      Rl 10:59   1:00 /home/ghatee/QuantumEspresso621/bin/pw.x -i mos2.rlx.in
> ghatee   16220 99.1  0.1 1930524 127764 ?      Rl 10:59   1:00 /home/ghatee/QuantumEspresso621/bin/pw.x -i mos2.rlx.in
> ghatee   16221 99.0  0.1 1930484 127760 ?      Rl 10:59   1:00 /home/ghatee/QuantumEspresso621/bin/pw.x -i mos2.rlx.in
> ghatee   16222 99.1  0.1 1930496 127760 ?      Rl 10:59   1:00 /home/ghatee/QuantumEspresso621/bin/pw.x -i mos2.rlx.in
> ghatee   16391  0.0  0.0 316388 26652 pts/16   Sl+ 11:00   0:00 /opt/rocks/bin/python /opt/rocks/bin/rocks run host rocks7 ps aux | grep pw.x
> ghatee   16394  0.0  0.0 113132  1368 pts/16   S+ 11:00   0:00 bash -c ps aux | grep pw.x
> ghatee   16396  0.0  0.0 112664   952 pts/16   S+ 11:00   0:00 grep pw.x
>

Process IDs 16219, 16220, 16221 and 16222 are the four pw.x tasks.
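
If you would rather not count the pw.x lines by eye, something like the following might help. It is only a sketch: count_tasks is a helper name made up for this example, and it assumes the usual `ps aux` layout in which the command starts in column 11. It counts only processes whose executable is pw.x, so the srun, bash and grep lines that merely mention pw.x are ignored.

```shell
#!/bin/sh
# count_tasks NAME  (hypothetical helper, not part of Slurm or Rocks)
# Reads `ps aux` output on stdin and prints how many processes have an
# executable whose basename equals NAME.
count_tasks() {
    awk -v name="$1" '
        {
            # Field 11 is the first word of the COMMAND column in `ps aux`.
            n = split($11, parts, "/")
            if (parts[n] == name) count++
        }
        END { print count + 0 }   # "+ 0" prints 0 when nothing matched
    '
}

# Possible usage, assuming `rocks run host` passes the output through unchanged:
#   rocks run host compute-0-2 "ps aux" | count_tasks pw.x
#   rocks run host rocks7 "ps aux" | count_tasks pw.x
```

Going by the ps listings quoted above, the first node would be expected to report 2 and the second 4.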

Or did I miss something?


Best
Marcus

>
> Regards,
> Mahmood

-- 
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner at itc.rwth-aachen.de
www.itc.rwth-aachen.de
