[slurm-users] What's the best way to suppress core dump files from jobs?

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Wed Mar 21 05:08:00 MDT 2018


We experience problems with MPI jobs dumping large numbers of multi-GB 
core dump files (one per MPI task), causing problems for file servers 
and compute nodes.

The user has "ulimit -c 0" in his .bashrc file, but that's ignored when 
slurmd starts the job, and the resource limits of the slurmd process are 
applied instead.
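
(For reference, the limit that actually takes effect inside a job can be 
checked with something like
   $ srun -n1 bash -c 'ulimit -c'
   unlimited
where "unlimited" is just illustrative output; on our nodes it reflects 
the slurmd limits rather than the user's 0.)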

I should mention that we have decided to configure slurm.conf with
   PropagateResourceLimitsExcept=ALL
because it's desirable to have rather restrictive user limits on login 
nodes.  Unfortunately, this means that the user's "ulimit -c 0" isn't 
propagated to any batch job.
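
For completeness, the relevant slurm.conf excerpt is just
   # slurm.conf (excerpt)
   PropagateResourceLimitsExcept=ALL
and, if I read the slurm.conf man page correctly, 
PropagateResourceLimits=NONE should express the same thing.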

What's the best way to suppress core dump files from jobs?  Does anyone 
have good or bad experiences?

One working solution is to modify the slurmd Systemd service file 
/usr/lib/systemd/system/slurmd.service to add a line:
   LimitCORE=0
I've documented further details in my Slurm Wiki page:
   https://wiki.fysik.dtu.dk/niflheim/Slurm_configuration#slurmd-systemd-limits
However, it's a bit cumbersome to modify the Systemd service file on 
all compute nodes.
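
A drop-in override might be slightly less intrusive than editing the 
packaged unit file itself, for example something like this (an untested 
sketch; the file name is arbitrary):
   # /etc/systemd/system/slurmd.service.d/corelimit.conf
   [Service]
   LimitCORE=0
followed by "systemctl daemon-reload && systemctl restart slurmd", 
though it still has to be pushed out to every compute node, e.g. via 
configuration management.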

Thanks for sharing any experiences.

/Ole


