[slurm-users] slurmstepd: error: Exceeded job memory limit at some point.

Geert Kapteijns ghkapteijns at gmail.com
Wed Feb 14 04:21:21 MST 2018


Hi everyone,

I’m running into out-of-memory errors when I specify an array job. Needless
to say, 100M should be more than enough, and increasing the allocated
memory to 1G doesn't solve the problem. I call my script as follows:

sbatch --array=100-199 run_batch_job

run_batch_job contains:

#!/usr/bin/env bash
#SBATCH --partition=lln
#SBATCH --output=/home/user/outs/%x.out.%a
#SBATCH --error=/home/user/outs/%x.err.%a
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=100M
#SBATCH --time=2-00:00:00

srun my_program.out $SLURM_ARRAY_TASK_ID
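For reference, the memory limit Slurm actually applies to an individual array task can be inspected while the task is pending or running (the job ID 12345 below is a placeholder):

scontrol show job 12345_100 | grep -i mem

With --mem-per-cpu this should report MinMemoryCPU=100M; with --mem it reports MinMemoryNode instead.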

Instead of using --mem-per-cpu and --cpus-per-task, I’ve also tried the
following:

#SBATCH --mem=100M
#SBATCH --ntasks=1  # Number of cores
#SBATCH --nodes=1   # All cores on one machine

But in both cases, for some of the runs, I get the error:

slurmstepd: error: Exceeded job memory limit at some point.
srun: error: obelix-cn002: task 0: Out Of Memory
slurmstepd: error: Exceeded job memory limit at some point.
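To rule out a genuine memory spike in my_program.out, the peak usage Slurm recorded for a finished task can be queried from accounting (assuming job accounting is enabled on the cluster; 12345 is again a placeholder job ID):

sacct -j 12345_100 --format=JobID,MaxRSS,MaxVMSize,State

MaxRSS is the peak resident set size sampled for the step; if it stays well below 100M for a task that was nevertheless killed, the enforcement rather than the program looks suspect.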

I’ve also posted the question on Stack Overflow
<https://stackoverflow.com/questions/48763851/how-to-specify-memory-per-process-in-an-array-job-in-slurm>.
Does anyone know what is happening here?

Kind regards,
Geert Kapteijns