[slurm-users] how to configure correctly node and memory when a script fails with out of memory

Gérard Henry (AMU) gerard.henry at univ-amu.fr
Mon Oct 30 14:53:42 UTC 2023


If I try to request just nodes and memory, for instance:
#SBATCH -N 2
#SBATCH --mem=0
to request all the memory on each node (2 nodes seem sufficient for a 
program that consumes 100GB), I get this error:
sbatch: error: CPU count per node can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available
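
One possible reading of this error, offered as an assumption and not a confirmed diagnosis: if --mem=0 is taken as the node's full RealMemory, and Slurm has to allocate enough CPUs to keep the memory per CPU at or below the partition's MaxMemPerCPU, then the request needs more CPUs than a node has. The sketch below only redoes that arithmetic with the values quoted in this thread; the 32 CPUs per node is inferred from cpu=5056 over node=158, and the variable names are illustrative.

```python
import math

# Values copied from the cluster configuration quoted in this thread.
real_memory_mb = 190000       # RealMemory of one node
max_mem_per_cpu_mb = 5778     # MaxMemPerCPU of the partition
total_cpus, total_nodes = 5056, 158

# Inferred, not stated directly in the post: CPUs available on one node.
cpus_per_node = total_cpus // total_nodes          # 32

# ASSUMPTION: --mem=0 means "all of RealMemory", and the scheduler needs
# enough CPUs so that memory per CPU does not exceed MaxMemPerCPU.
cpus_needed = math.ceil(real_memory_mb / max_mem_per_cpu_mb)   # 33

# Largest per-node memory request that would still fit in 32 CPUs
# under the same assumption.
largest_mem_mb = cpus_per_node * max_mem_per_cpu_mb            # 184896

print(f"CPUs per node: {cpus_per_node}, CPUs needed for --mem=0: {cpus_needed}")
print(f"Largest per-node request under this assumption: {largest_mem_mb}M")
```

If this reading is right, an explicit per-node value no larger than that figure (instead of --mem=0) would avoid the CPU count problem, but this has not been verified on the cluster.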

thanks

On 30/10/2023 15:46, Gérard Henry (AMU) wrote:
> Hello all,
> 
> 
> I can't configure the Slurm script correctly. My program needs 100GB of 
> memory; that is the only criterion. But the job always fails with an 
> out-of-memory error.
> Here's the cluster configuration I'm using:
> 
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core_Memory
> 
> partition:
> DefMemPerCPU=5770 MaxMemPerCPU=5778
> TRES=cpu=5056,mem=30020000M,node=158
> for each node: CPUAlloc=32 RealMemory=190000 AllocMem=184640
> 
> my script contains:
> #SBATCH -N 5
> #SBATCH --ntasks=60
> #SBATCH --mem-per-cpu=1500M
> #SBATCH --cpus-per-task=1
> ...
> mpirun ../zsimpletest_analyse
> 
> When it fails, sacct gives the following information:
> JobID           JobName    Elapsed      NCPUS   TotalCPU    CPUTime     ReqMem     MaxRSS  MaxDiskRead MaxDiskWrite      State ExitCode
> ------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ------------ ------------ ---------- --------
> 8500578        analyse5   00:03:04         60   02:57:58   03:04:00     90000M                                      OUT_OF_ME+    0:125
> 8500578.bat+      batch   00:03:04         16  46:34.302   00:49:04             21465736K        0.23M        0.01M OUT_OF_ME+    0:125
> 8500578.0         orted   00:03:05         44   02:11:24   02:15:40                40952K        0.42M        0.03M  COMPLETED      0:0
> 
> I don't understand why a MaxRSS of 21465736K (about 21 GB) leads to 
> "out of memory" with 16 CPUs and 1500M per CPU (about 24 GB).
> 
> Can anybody help?
> 
> Thanks in advance
> 
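
The numbers in the quoted script and sacct output can be checked with a few lines of arithmetic. This is only a sketch: it assumes that "100GB" means 100 x 1024 M in Slurm's units, that the batch step's limit is simply its 16 CPUs times --mem-per-cpu, and (for the last figure) that memory is spread evenly over the 60 tasks, which may not hold for this program.

```python
import math

# Values copied from the quoted script and sacct output.
mem_per_cpu_mb = 1500          # --mem-per-cpu=1500M
ntasks = 60                    # --ntasks=60, one CPU per task
batch_step_cpus = 16           # NCPUS of step 8500578.bat+
batch_max_rss_kb = 21465736    # MaxRSS of step 8500578.bat+

# Total request: matches the ReqMem column (90000M).
total_request_mb = ntasks * mem_per_cpu_mb

# ASSUMPTION: the batch step may use its CPU count times --mem-per-cpu.
batch_limit_mb = batch_step_cpus * mem_per_cpu_mb      # 24000
batch_max_rss_mb = batch_max_rss_kb / 1024             # about 20963
headroom_mb = batch_limit_mb - batch_max_rss_mb        # about 3037

# ASSUMPTION: "100GB" is 100 * 1024 M, spread evenly over all tasks.
needed_mb = 100 * 1024
shortfall_mb = needed_mb - total_request_mb            # 12400
per_cpu_needed_mb = math.ceil(needed_mb / ntasks)      # 1707

print(f"requested {total_request_mb}M, needed {needed_mb}M, short by {shortfall_mb}M")
print(f"batch step: limit {batch_limit_mb}M, MaxRSS {batch_max_rss_mb:.0f}M")
print(f"even split would need at least {per_cpu_needed_mb}M per CPU")
```

Two things stand out under these assumptions: the whole job asks for 90000M, which is less than the 100GB the program is said to need, and the recorded MaxRSS on the first node is already within roughly 3 GB of that step's limit. MaxRSS is a sampled value, so a short peak above the limit may not appear in it.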

-- 
Gérard HENRY
Institut Fresnel - UMR 7249
+33 413945457
Aix-Marseille Université - Campus Etoile, BATIMENT FRESNEL, Avenue 
Escadrille Normandie Niemen, 13013 Marseille
Website: https://fresnel.fr/
To protect the environment, please print this email only if necessary.


