[slurm-users] cpu limit issue

Renfro, Michael Renfro at tntech.edu
Wed Jul 11 08:22:52 MDT 2018


Looking at your script, there’s a chance that because you specified only ntasks, rather than ntasks-per-node or a similar parameter, Slurm allocated 8 CPUs on one node and the remaining 4 on another.
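
One way to check is to look at SLURM_TASKS_PER_NODE from inside the job. Slurm writes it in a compressed form, where "2(x3),1" means three nodes with 2 tasks each followed by one node with 1. Here is a rough sketch that expands that notation so a split allocation is easy to spot. The helper name expand_counts is my own, not part of Slurm, and it assumes the value is well-formed:

```shell
# Hypothetical helper: expand Slurm's compressed per-node counts,
# e.g. "2(x3),1" -> "2 2 2 1" (one number per allocated node).
# Plain POSIX sh, so the working variables are not local.
expand_counts() {
    out=""
    old_ifs=$IFS
    IFS=','
    for item in $1; do
        case $item in
            *"(x"*")")
                count=${item%%"("*}    # part before "("
                reps=${item#*"(x"}     # part after "(x"
                reps=${reps%")"}       # drop the trailing ")"
                ;;
            *)
                count=$item
                reps=1
                ;;
        esac
        i=0
        while [ "$i" -lt "$reps" ]; do
            out="${out:+$out }$count"
            i=$((i + 1))
        done
    done
    IFS=$old_ifs
    printf '%s\n' "$out"
}

# Inside a job: more than one number means the tasks span several nodes.
echo "tasks per node: $(expand_counts "${SLURM_TASKS_PER_NODE:-}")"
```

If that prints something like "8 4" for your job, the 12 tasks were split across two nodes.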

Regardless, I’ve dug into my Gaussian documentation, and here’s my test case for you to see what happens:

1. Make a copy of tests/com/test1044.com from the Gaussian main directory.
2. Reserve some number of cores on a single node. I’m using --cpus-per-task=N instead of --ntasks-per-node, but it might not matter. Regardless, I try to stick with the cpus-per-task format for OpenMP-type programs. My job script is:

=====

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=28

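module load gaussian
# Tell Gaussian to use as many cores as Slurm allocated to this task.
export GAUSS_PDEF="${SLURM_CPUS_PER_TASK}"
# Per-job scratch directory on node-local disk.
export GAUSS_SCRDIR="$(mktemp -d -p /local/tmp)"
g09 test1044.com
# Remove the scratch directory once Gaussian finishes.
rm -r "${GAUSS_SCRDIR}"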

=====

You’ll need to adapt the module load command to your site, and you might not need the GAUSS_SCRDIR variable. The GAUSS_PDEF variable does the same thing as the NProc commands in the input file, but doesn’t require modifying the input file.
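
If you do keep GAUSS_SCRDIR, one thing to watch is that the final rm only runs if the script gets that far. As a sketch of one alternative (run_with_scratch is a made-up name, and this assumes a mktemp that supports -p, as the script above already does), you can wrap the run in a subshell with an EXIT trap so the scratch directory is removed whether the command succeeds or fails:

```shell
# Hypothetical helper: run a command with a private scratch directory
# under BASEDIR, exported as GAUSS_SCRDIR, and remove it afterwards.
# Usage: run_with_scratch BASEDIR COMMAND [ARGS...]
run_with_scratch() {
    base=$1
    shift
    (
        GAUSS_SCRDIR=$(mktemp -d -p "$base") || exit 1
        export GAUSS_SCRDIR
        # Runs when the subshell exits, with either a zero or nonzero status.
        trap 'rm -rf "$GAUSS_SCRDIR"' EXIT
        "$@"
    )
}
```

In the job script that would replace the last three lines with: run_with_scratch /local/tmp g09 test1044.com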

With that job script and that test file, I max out all 28 cores in my node for certain parts of the calculation, as seen in 'top'. The job takes about 21 minutes of CPU time. Your times will obviously vary.
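
If you’d rather not watch top, ps reports the same kind of figure in its %CPU column, where 100% is one fully busy core. A rough sketch for turning that number into a core count follows. The name busy_cores is my own, and keep in mind that ps averages over the life of the process, so it lags behind what top shows:

```shell
# Hypothetical helper: convert a ps/top %CPU figure into an approximate
# number of fully busy cores (100% = one core), rounded to nearest.
busy_cores() {
    awk -v pcpu="$1" 'BEGIN { printf "%d\n", (pcpu + 50) / 100 }'
}

busy_cores 803    # the %CPU of the l502.exe line quoted below; prints 8
```

On my node I’d expect that to come out near 28 during the parallel parts of the run.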

What you find from the test case should give some insight into where to go next.

> On Jul 11, 2018, at 4:04 AM, Mahmood Naderan <mahmood.nt at gmail.com> wrote:
> 
> My fault. One of the other nodes was in my mind!
> 
> The node which is running g09 is
> 
> 
> [root@compute-0-3 ~]# ps aux | grep l502
> root     11198  0.0  0.0 112664   968 pts/0    S+   13:31   0:00 grep --color=auto l502
> nooriza+ 30909  803  1.4 21095004 947968 ?     Rl   Jul10 6752:47 /usr/local/chem/g09-64-D01/l502.exe 2415919104
> [root@compute-0-3 ~]# lscpu
> Architecture:          x86_64
> CPU op-mode(s):        32-bit, 64-bit
> Byte Order:            Little Endian
> CPU(s):                32
> On-line CPU(s) list:   0-31
> Thread(s) per core:    2
> Core(s) per socket:    8
> Socket(s):             2
> NUMA node(s):          4
> Vendor ID:             AuthenticAMD
> CPU family:            21
> Model:                 1
> Model name:            AMD Opteron(tm) Processor 6282 SE
> Stepping:              2
> 
> 
> Regards,
> Mahmood
> 
> 
> 
> On Wed, Jul 11, 2018 at 1:25 PM, John Hearns <hearnsj at googlemail.com> wrote:
> Mahmood, please please forgive me for saying this.  A quick Google shows that Opteron 61xx have eight or twelve cores.
> Have you checked that all the servers have 12 cores?
> I realise I am appearing stupid here.
> 
> 
> 
> 