[slurm-users] Sshare -l segfaults
Christopher Benjamin Coffey
Chris.Coffey at nau.edu
Fri Jul 12 22:14:04 UTC 2019
Hi All,
Has anyone had issues with sshare segfaulting, specifically with "sshare -l"? Any suggestions on how to figure this one out? Maybe there is something obvious I'm not seeing. This has been happening across many Slurm versions; I can't recall when it started. For the last couple of versions I've hoped that the bug would be patched, but it hasn't been.
We are currently running Slurm version 18.08.7 with Fair_Tree enabled on CentOS 6.10.
See the output below from "sshare -l":
.... snipped ...
dickson jja8 128 0.500000 4205 0.000030 0.999998 0.391061 0.500001 cpu=0,mem=0,energy=0,node=0,b+
dickson lz62 128 0.500000 0 0.000000 0.000002 0.392924 2.1169e+05 cpu=0,mem=0,energy=0,node=0,b+
*** glibc detected *** sshare: free(): invalid next size (fast): 0x00000000013512f0 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3ddba75e5e]
/lib64/libc.so.6[0x3ddba78cf0]
/usr/lib64/slurm/libslurmfull.so(slurm_xfree+0x1d)[0x3f5a75f65f]
/usr/lib64/slurm/libslurmfull.so(print_fields_double+0x209)[0x3f5a6b1c3b]
sshare(process+0x55c)[0x40266d]
sshare[0x402abb]
sshare(main+0xa1f)[0x4034f8]
/lib64/libc.so.6(__libc_start_main+0x100)[0x3ddba1ed20]
sshare[0x401e99]
======= Memory map: ========
00400000-00405000 r-xp 00000000 fd:04 4710 /usr/bin/sshare
00605000-00608000 rw-p 00005000 fd:04 4710 /usr/bin/sshare
012ab000-01372000 rw-p 00000000 00:00 0 [heap]
3ddb600000-3ddb620000 r-xp 00000000 fd:00 196 /lib64/ld-2.12.so
3ddb820000-3ddb821000 r--p 00020000 fd:00 196 /lib64/ld-2.12.so
3ddb821000-3ddb822000 rw-p 00021000 fd:00 196 /lib64/ld-2.12.so
3ddb822000-3ddb823000 rw-p 00000000 00:00 0
3ddba00000-3ddbb8b000 r-xp 00000000 fd:00 306 /lib64/libc-2.12.so
3ddbb8b000-3ddbd8a000 ---p 0018b000 fd:00 306 /lib64/libc-2.12.so
3ddbd8a000-3ddbd8e000 r--p 0018a000 fd:00 306 /lib64/libc-2.12.so
3ddbd8e000-3ddbd90000 rw-p 0018e000 fd:00 306 /lib64/libc-2.12.so
3ddbd90000-3ddbd94000 rw-p 00000000 00:00 0
3ddbe00000-3ddbe83000 r-xp 00000000 fd:00 5655 /lib64/libm-2.12.so
3ddbe83000-3ddc082000 ---p 00083000 fd:00 5655 /lib64/libm-2.12.so
3ddc082000-3ddc083000 r--p 00082000 fd:00 5655 /lib64/libm-2.12.so
3ddc083000-3ddc084000 rw-p 00083000 fd:00 5655 /lib64/libm-2.12.so
3ddc200000-3ddc202000 r-xp 00000000 fd:00 2425 /lib64/libdl-2.12.so
3ddc202000-3ddc402000 ---p 00002000 fd:00 2425 /lib64/libdl-2.12.so
3ddc402000-3ddc403000 r--p 00002000 fd:00 2425 /lib64/libdl-2.12.so
3ddc403000-3ddc404000 rw-p 00003000 fd:00 2425 /lib64/libdl-2.12.so
3ddc600000-3ddc617000 r-xp 00000000 fd:00 1865 /lib64/libpthread-2.12.so
3ddc617000-3ddc817000 ---p 00017000 fd:00 1865 /lib64/libpthread-2.12.so
3ddc817000-3ddc818000 r--p 00017000 fd:00 1865 /lib64/libpthread-2.12.so
3ddc818000-3ddc819000 rw-p 00018000 fd:00 1865 /lib64/libpthread-2.12.so
3ddc819000-3ddc81d000 rw-p 00000000 00:00 0
3ddca00000-3ddca08000 r-xp 00000000 fd:04 160337 /usr/lib64/libhistory.so.6.0
3ddca08000-3ddcc07000 ---p 00008000 fd:04 160337 /usr/lib64/libhistory.so.6.0
3ddcc07000-3ddcc08000 rw-p 00007000 fd:04 160337 /usr/lib64/libhistory.so.6.0
3ddfa00000-3ddfa16000 r-xp 00000000 fd:00 5664 /lib64/libgcc_s-4.4.7-20120601.so.1
3ddfa16000-3ddfc15000 ---p 00016000 fd:00 5664 /lib64/libgcc_s-4.4.7-20120601.so.1
3ddfc15000-3ddfc16000 rw-p 00015000 fd:00 5664 /lib64/libgcc_s-4.4.7-20120601.so.1
3de2600000-3de263a000 r-xp 00000000 fd:00 3438 /lib64/libreadline.so.6.0
3de263a000-3de283a000 ---p 0003a000 fd:00 3438 /lib64/libreadline.so.6.0
3de283a000-3de2842000 rw-p 0003a000 fd:00 3438 /lib64/libreadline.so.6.0
3de2842000-3de2843000 rw-p 00000000 00:00 0
3de7200000-3de721d000 r-xp 00000000 fd:00 2401 /lib64/libtinfo.so.5.7
3de721d000-3de741c000 ---p 0001d000 fd:00 2401 /lib64/libtinfo.so.5.7
3de741c000-3de7420000 rw-p 0001c000 fd:00 2401 /lib64/libtinfo.so.5.7
3de7420000-3de7421000 rw-p 00000000 00:00 0
3de9e00000-3de9e22000 r-xp 00000000 fd:00 534 /lib64/libncurses.so.5.7
3de9e22000-3dea021000 ---p 00022000 fd:00 534 /lib64/libncurses.so.5.7
3dea021000-3dea022000 rw-p 00021000 fd:00 534 /lib64/libncurses.so.5.7
3f5a600000-3f5a7be000 r-xp 00000000 fd:04 301565 /usr/lib64/slurm/libslurmfull.so
3f5a7be000-3f5a9be000 ---p 001be000 fd:04 301565 /usr/lib64/slurm/libslurmfull.so
3f5a9be000-3f5a9c7000 rw-p 001be000 fd:04 301565 /usr/lib64/slurm/libslurmfull.so
3f5a9c7000-3f5a9cc000 rw-p 00000000 00:00 0
7fedc4000000-7fedc4021000 rw-p 00000000 00:00 0
7fedc4021000-7fedc8000000 ---p 00000000 00:00 0
7fedca80e000-7fedca816000 r-xp 00000000 fd:04 194120 /usr/lib64/libmunge.so.2.0.0
7fedca816000-7fedcaa16000 ---p 00008000 fd:04 194120 /usr/lib64/libmunge.so.2.0.0
7fedcaa16000-7fedcaa17000 rw-p 00008000 fd:04 194120 /usr/lib64/libmunge.so.2.0.0
7fedcaa17000-7fedcaa1a000 r-xp 00000000 fd:04 301563 /usr/lib64/slurm/auth_munge.so
7fedcaa1a000-7fedcac19000 ---p 00003000 fd:04 301563 /usr/lib64/slurm/auth_munge.so
7fedcac19000-7fedcac1a000 rw-p 00002000 fd:04 301563 /usr/lib64/slurm/auth_munge.so
7fedcac1a000-7fedcac56000 r--s 00000000 fd:02 866 /var/db/nscd/hosts
7fedcac56000-7fedcae0e000 r--s 00000000 fd:02 856 /var/db/nscd/passwd
7fedcae0e000-7fedcae14000 rw-p 00000000 00:00 0
7fedcae2d000-7fedcae30000 rw-p 00000000 00:00 0
7ffd741fa000-7ffd7420f000 rw-p 00000000 00:00 0 [stack]
7ffd74365000-7ffd74366000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
doerry 1024 0.007435 0 0.000000 0.000000 2.8788e+24 Aborted
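For anyone who wants to pick the trace apart, here is a minimal Python sketch that splits the glibc backtrace lines above into module, symbol, offset, and absolute address, and converts an absolute address into an offset within its shared object. It is not part of Slurm; the names `Frame`, `parse_frame`, `parse_backtrace`, and `module_offset` are made up for illustration, and the load base is the start of the first libslurmfull.so mapping in the memory map above.

```python
import re
from typing import List, NamedTuple, Optional

# Matches glibc backtrace frames such as:
#   /usr/lib64/slurm/libslurmfull.so(print_fields_double+0x209)[0x3f5a6b1c3b]
#   sshare[0x402abb]
FRAME_RE = re.compile(
    r"^(?P<module>[^(\[\s]+)"
    r"(?:\((?P<symbol>[^+)]*)(?:\+(?P<offset>0x[0-9a-fA-F]+))?\))?"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)


class Frame(NamedTuple):
    module: str
    symbol: Optional[str]   # None when glibc printed no symbol name
    offset: Optional[int]   # offset from the symbol, if printed
    addr: int               # absolute address of the frame


def parse_frame(line: str) -> Optional[Frame]:
    """Parse one backtrace line; return None for anything else."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    off = m.group("offset")
    return Frame(
        module=m.group("module"),
        symbol=m.group("symbol") or None,
        offset=int(off, 16) if off else None,
        addr=int(m.group("addr"), 16),
    )


def parse_backtrace(text: str) -> List[Frame]:
    """Keep only the lines that look like backtrace frames."""
    frames = (parse_frame(line) for line in text.splitlines())
    return [f for f in frames if f is not None]


def module_offset(frame: Frame, load_base: int) -> int:
    """Offset of the frame inside its module, given the module's load base."""
    return frame.addr - load_base


# Frames copied from the output above.
SAMPLE = """\
======= Backtrace: =========
/lib64/libc.so.6[0x3ddba75e5e]
/lib64/libc.so.6[0x3ddba78cf0]
/usr/lib64/slurm/libslurmfull.so(slurm_xfree+0x1d)[0x3f5a75f65f]
/usr/lib64/slurm/libslurmfull.so(print_fields_double+0x209)[0x3f5a6b1c3b]
sshare(process+0x55c)[0x40266d]
sshare[0x402abb]
sshare(main+0xa1f)[0x4034f8]
/lib64/libc.so.6(__libc_start_main+0x100)[0x3ddba1ed20]
sshare[0x401e99]
======= Memory map: ========
"""

# Start of the first libslurmfull.so mapping in the memory map above.
LIBSLURMFULL_BASE = 0x3F5A600000

if __name__ == "__main__":
    for frame in parse_backtrace(SAMPLE):
        if frame.module.endswith("libslurmfull.so"):
            print(frame.symbol, hex(module_offset(frame, LIBSLURMFULL_BASE)))
```

If debug symbols for libslurmfull.so are installed, the resulting offsets could be passed to a tool such as addr2line to get source lines. That step is a suggestion, not something I have tried here.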
If you have any thoughts, please cc me on the reply.
Best,
Chris
--
Christopher Coffey
High-Performance Computing
Northern Arizona University
928-523-1167