[slurm-users] Stripped binaries and parallel debugging

Pär Lindfors paran at nsc.liu.se
Fri Jun 15 06:35:47 MDT 2018


Hi,

Slurm's spec file disables RPM's normal behaviour, where debugging symbols
are extracted and shipped in a separate debuginfo RPM. There is a comment
saying that this is done to avoid breaking parallel debugging.

Does anybody know what parallel debugging use case this refers to?

I did a small test and stripped all files from Slurm packages on a few
compute nodes, and could still successfully use Allinea DDT to launch
and debug an MPI application using srun and PMI2.
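
A rough sketch of how such a test could be reproduced, not necessarily how
the test above was carried out. The helper names is_elf and strip_elf_files,
the STRIP_FOR_REAL variable and the package names in the usage comment are
all hypothetical. By default it only lists the ELF files it would strip:

```shell
# is_elf PATH: succeed if PATH is a regular file that starts with the
# four ELF magic bytes (0x7f 'E' 'L' 'F').
is_elf() {
    [ -f "$1" ] || return 1
    magic=$(head -c 4 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# strip_elf_files: read one path per line on stdin and print the ELF files
# among them. The files are only modified when STRIP_FOR_REAL=1 is set.
strip_elf_files() {
    while IFS= read -r path; do
        is_elf "$path" || continue
        echo "$path"
        if [ "${STRIP_FOR_REAL:-0}" = "1" ]; then
            strip "$path"
        fi
    done
}

# Hypothetical usage on a test node (package names are an assumption):
#   rpm -ql slurm slurm-plugins | strip_elf_files                   # dry run
#   rpm -ql slurm slurm-plugins | STRIP_FOR_REAL=1 strip_elf_files  # modifies files
```

Keeping the dry run as the default makes it easy to review the file list
before anything on the node is changed.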

The relevant part of slurm.spec is:
  #
  # Never allow rpm to strip binaries as this will break
  #  parallel debugging capability
  # Note that brp-compress does not compress man pages installed
  #  into non-standard locations (e.g. /usr/local)
  #
  %define __os_install_post /usr/lib/rpm/brp-compress
  %define debug_package %{nil}

This has been there since Slurm 0.3.8 in 2004, commit
c327536db0ae54af71789b693a0c7413479d3bf0.

If some symbols are needed, RPM can be configured to use "strip -g",
which removes only the debugging symbols and leaves everything else. The
other symbols don't use any significant amount of space anyway. This is
done using:

  %global _find_debuginfo_opts -g
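
To check what a given build actually left in a binary, the section list from
"readelf -S -W" can be inspected. The following is a minimal sketch: the
function name classify_sections and its three labels are made up for
illustration, and it assumes uncompressed DWARF sections named .debug_info:

```shell
# classify_sections: read the output of `readelf -S -W FILE` on stdin and
# report which kind of symbol information the section list suggests.
classify_sections() {
    sections=$(cat)
    if printf '%s\n' "$sections" | grep -q '[[:space:]]\.debug_info[[:space:]]'; then
        echo "debug-info"       # DWARF data still present: not stripped
    elif printf '%s\n' "$sections" | grep -q '[[:space:]]\.symtab[[:space:]]'; then
        echo "symtab-only"      # what "strip -g" is expected to leave behind
    else
        echo "fully-stripped"   # plain "strip": no .symtab, no DWARF
    fi
}

# Hypothetical usage (the path is an assumption):
#   readelf -S -W /usr/sbin/slurmd | classify_sections
```

A package built with the -g option above would be expected to report
"symtab-only" for its binaries and plugins.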

The reason I looked into this is that a few of our nodes recently ran
out of disk space, and I was a bit surprised to realize that our
Slurm packages used over 500 MB. That cluster is using Slurm 17.02, and
the package size appears to depend a lot on how we have built our packages.
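
For anyone who wants to check their own nodes, a small sketch for totalling
the installed size of the Slurm packages. The helper name sum_package_sizes
is hypothetical, and the 'slurm*' pattern assumes the packages are named that
way. %{SIZE} is the installed size in bytes as recorded by RPM:

```shell
# sum_package_sizes: read "NAME SIZE_IN_BYTES" lines on stdin and print
# the total in MB (1 MB = 1024 * 1024 bytes here).
sum_package_sizes() {
    awk '{ total += $2 } END { printf "%.1f MB\n", total / (1024 * 1024) }'
}

# Hypothetical usage:
#   rpm -qa --queryformat '%{NAME} %{SIZE}\n' 'slurm*' | sum_package_sizes
```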

With Slurm 17.11 the size drops dramatically thanks to the dynamic
linking of libslurmfull.so, so disk usage is no longer really an
issue. But I am still curious whether anything actually depends on having
the debugging symbols.

Regards,
Pär Lindfors,
NSC


