[slurm-users] Runaway jobs issue: : Resource temporarily unavailable, slurm 17.11.3

Chris Samuel chris at csamuel.org
Tue Apr 24 23:47:17 MDT 2018


On Wednesday, 25 April 2018 5:59:38 AM AEST Christopher Benjamin Coffey wrote:

> #define MAX_MSG_SIZE     (16*1024*1024)

That is really strange: there are 4 different definitions of that
symbol in the source code.

$ git grep 'define MAX_MSG_SIZE'
src/common/slurm_persist_conn.c:#define MAX_MSG_SIZE     (16*1024*1024)
src/common/slurm_protocol_socket_implementation.c:#define MAX_MSG_SIZE     (1024*1024*1024)
src/plugins/mpi/pmix/pmixp_debug.h:#define MAX_MSG_SIZE 1024
src/slurmdbd/rpc_mgr.c:#define MAX_MSG_SIZE     (16*1024*1024)

Now the PMIX one is a separate one altogether, but I wonder if there
have been accidental redefinitions. For instance, the one in slurm_persist_conn.c
didn't appear until 2016, whilst the one in slurm_protocol_socket_implementation.c
was set to that value (1GB) back in 2013.
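
To illustrate why mismatched limits could matter, here is a minimal sketch.
It is not Slurm code: the macro and function names are hypothetical, only the
two values are taken from the grep output above, and the "<=" comparison is an
assumption about how such a limit would be checked. It shows that a message
size can pass the 1GB limit while failing the 16MB one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the values mirror two of the definitions found by git grep. */
#define PERSIST_CONN_MAX_MSG_SIZE    (16 * 1024 * 1024)    /* value in slurm_persist_conn.c */
#define PROTOCOL_SOCKET_MAX_MSG_SIZE (1024 * 1024 * 1024)  /* value in slurm_protocol_socket_implementation.c */

/* Assumed check: a message is accepted if it does not exceed the limit. */
static bool fits_persist_conn(size_t msg_size)
{
    return msg_size <= PERSIST_CONN_MAX_MSG_SIZE;
}

static bool fits_protocol_socket(size_t msg_size)
{
    return msg_size <= PROTOCOL_SOCKET_MAX_MSG_SIZE;
}

/* True when the two limits give different answers for the same message size. */
static bool limits_disagree(size_t msg_size)
{
    return fits_protocol_socket(msg_size) != fits_persist_conn(msg_size);
}
```

Under these assumptions, any size above 16MB and up to 1GB (say 32MB) makes
limits_disagree() return true, which is the kind of mismatch I am wondering about.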

I'll open a bug just in case.

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC



