[slurm-users] how to configure correctly node and memory when a script fails with out of memory
Gérard Henry (AMU)
gerard.henry at univ-amu.fr
Mon Oct 30 14:53:42 UTC 2023
If I try to request just nodes and memory, for instance:
#SBATCH -N 2
#SBATCH --mem=0
to request all the memory on each node (2 nodes seem sufficient for a
program that consumes 100GB), I get this error:
sbatch: error: CPU count per node can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available
Thanks
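
For reference, here is a rough shell sketch of the arithmetic behind this request. It assumes, and nothing in this thread confirms it, that Slurm turns --mem=0 into the node's full RealMemory and then needs enough CPUs to keep the memory per CPU under MaxMemPerCPU. The variable names are illustrative only.

```shell
#!/bin/sh
# Values copied from the cluster configuration quoted below.
real_memory_mb=190000     # RealMemory per node
max_mem_per_cpu_mb=5778   # MaxMemPerCPU on the partition
cpus_per_node=32          # CPUs per node (TRES cpu=5056 / node=158)

# Assumption: a whole-node memory request needs
# ceil(memory / MaxMemPerCPU) CPUs on that node.
cpus_needed=$(( (real_memory_mb + max_mem_per_cpu_mb - 1) / max_mem_per_cpu_mb ))

echo "CPUs needed to cover ${real_memory_mb}M: ${cpus_needed}"
echo "CPUs available per node: ${cpus_per_node}"
```

With these numbers the sketch reports 33 CPUs needed against 32 available, which would be consistent with the "CPU count per node" error if the assumption holds.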
On 30/10/2023 15:46, Gérard Henry (AMU) wrote:
> Hello all,
>
>
> I can't configure the Slurm script correctly. My program needs 100GB of
> memory; that is the only criterion. But the job always fails with an
> out-of-memory error.
> Here's the cluster configuration I'm using:
>
> SelectType=select/cons_res
> SelectTypeParameters=CR_Core_Memory
>
> partition:
> DefMemPerCPU=5770 MaxMemPerCPU=5778
> TRES=cpu=5056,mem=30020000M,node=158
> for each node: CPUAlloc=32 RealMemory=190000 AllocMem=184640
>
> my script contains:
> #SBATCH -N 5
> #SBATCH --ntasks=60
> #SBATCH --mem-per-cpu=1500M
> #SBATCH --cpus-per-task=1
> ...
> mpirun ../zsimpletest_analyse
>
> When it fails, sacct gives the following information:
> JobID        JobName  Elapsed  NCPUS TotalCPU  CPUTime  ReqMem MaxRSS    MaxDiskRead MaxDiskWrite State      ExitCode
> ------------ -------- -------- ----- --------- -------- ------ --------- ----------- ------------ ---------- --------
> 8500578      analyse5 00:03:04 60    02:57:58  03:04:00 90000M                                    OUT_OF_ME+ 0:125
> 8500578.bat+ batch    00:03:04 16    46:34.302 00:49:04        21465736K 0.23M       0.01M        OUT_OF_ME+ 0:125
> 8500578.0    orted    00:03:05 44    02:11:24  02:15:40        40952K    0.42M       0.03M        COMPLETED  0:0
>
> I don't understand why MaxRSS=21G leads to "out of memory" with 16 CPUs
> and 1500M per CPU (24G).
>
> Can anybody help?
>
> Thanks in advance.
>
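
The memory figures in the quoted message can be checked with a short shell sketch. It assumes that the limit on a node is the number of CPUs allocated there times --mem-per-cpu, and that sacct's "K" suffix means KiB. The variable names are illustrative.

```shell
#!/bin/sh
# Values copied from the job script and sacct output quoted above.
ntasks=60
mem_per_cpu_mb=1500
batch_step_cpus=16          # NCPUS of the 8500578.bat+ step
batch_max_rss_kb=21465736   # MaxRSS reported for the batch step

# Total request; should match ReqMem=90000M.
total_req_mb=$(( ntasks * mem_per_cpu_mb ))

# Assumed memory limit on the node running the batch step.
batch_limit_mb=$(( batch_step_cpus * mem_per_cpu_mb ))

# Convert the reported value from KiB to MiB (integer division).
batch_max_rss_mb=$(( batch_max_rss_kb / 1024 ))

echo "total requested:  ${total_req_mb}M"
echo "batch step limit: ${batch_limit_mb}M"
echo "batch step peak:  ${batch_max_rss_mb}M"
```

This gives 90000M requested in total, a 24000M limit for the 16 CPUs of the batch step, and a recorded peak of about 20962M, so the peak sits close to the limit rather than far below it.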
--
Gérard HENRY
Institut Fresnel - UMR 7249
+33 413945457
Aix-Marseille Université - Campus Etoile, BATIMENT FRESNEL, Avenue
Escadrille Normandie Niemen, 13013 Marseille
Website: https://fresnel.fr/
To protect the environment, please print this email only if
necessary.