[slurm-users] fail when trying to set up selection=con_res
Ethan Van Matre
vanmatre at ohsu.edu
Wed Nov 29 13:32:33 MST 2017
Here is some more data:
I changed slurm.conf to have:
SelectType=select/cons_res
SelectTypeParameters=CR_CPU
Then I restarted slurmctld:
sudo systemctl restart slurmctld.service
The log on the host said:
[2017-11-29T12:23:56.384] error: we don't have select plugin type 101
[2017-11-29T12:23:56.384] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:23:56.384] error: Malformed RPC of type REQUEST_ABORT_JOB(6013) received
[2017-11-29T12:23:56.384] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
Then I ran sudo scontrol reconfigure, and the log said:
[2017-11-29T12:23:56.394] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:24:34.889] Message aggregation disabled
[2017-11-29T12:24:34.890] Resource spec: Reserved system memory limit not configured for this node
sview had the running jobs cleared out of its context (they are still running), but I kind of expected that.
I then submitted 6 jobs that do nothing but sleep to the partition, and the log says:
[2017-11-29T12:25:39.424] error: we don't have select plugin type 101
[2017-11-29T12:25:39.424] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.424] error: Malformed RPC of type REQUEST_BATCH_JOB_LAUNCH(4005) received
[2017-11-29T12:25:39.424] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.424] error: we don't have select plugin type 101
[2017-11-29T12:25:39.424] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.424] error: Malformed RPC of type REQUEST_BATCH_JOB_LAUNCH(4005) received
[2017-11-29T12:25:39.424] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.424] error: we don't have select plugin type 101
[2017-11-29T12:25:39.424] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.424] error: Malformed RPC of type REQUEST_BATCH_JOB_LAUNCH(4005) received
[2017-11-29T12:25:39.424] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.424] error: we don't have select plugin type 101
[2017-11-29T12:25:39.424] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.424] error: Malformed RPC of type REQUEST_BATCH_JOB_LAUNCH(4005) received
[2017-11-29T12:25:39.424] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.425] error: we don't have select plugin type 101
[2017-11-29T12:25:39.425] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.425] error: Malformed RPC of type REQUEST_BATCH_JOB_LAUNCH(4005) received
[2017-11-29T12:25:39.425] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.425] error: we don't have select plugin type 101
[2017-11-29T12:25:39.425] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.425] error: Malformed RPC of type REQUEST_BATCH_JOB_LAUNCH(4005) received
[2017-11-29T12:25:39.425] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.434] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.434] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.434] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.434] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.435] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.435] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.436] error: we don't have select plugin type 101
[2017-11-29T12:25:39.436] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.436] error: Malformed RPC of type REQUEST_TERMINATE_JOB(6011) received
[2017-11-29T12:25:39.436] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.436] error: we don't have select plugin type 101
[2017-11-29T12:25:39.436] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.436] error: Malformed RPC of type REQUEST_TERMINATE_JOB(6011) received
[2017-11-29T12:25:39.436] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.436] error: we don't have select plugin type 101
[2017-11-29T12:25:39.436] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.436] error: Malformed RPC of type REQUEST_TERMINATE_JOB(6011) received
[2017-11-29T12:25:39.436] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.436] error: we don't have select plugin type 101
[2017-11-29T12:25:39.436] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.436] error: Malformed RPC of type REQUEST_TERMINATE_JOB(6011) received
[2017-11-29T12:25:39.436] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.436] error: we don't have select plugin type 101
[2017-11-29T12:25:39.436] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.436] error: Malformed RPC of type REQUEST_TERMINATE_JOB(6011) received
[2017-11-29T12:25:39.436] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.437] error: we don't have select plugin type 101
[2017-11-29T12:25:39.437] error: select_g_select_jobinfo_unpack: unpack error
[2017-11-29T12:25:39.437] error: Malformed RPC of type REQUEST_TERMINATE_JOB(6011) received
[2017-11-29T12:25:39.437] error: slurm_receive_msg_and_forward: Header lengths are longer than data received
[2017-11-29T12:25:39.446] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.446] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.446] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.446] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.447] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
[2017-11-29T12:25:39.447] error: service_connection: slurm_receive_msg: Header lengths are longer than data received
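Since the same four-line error group repeats many times, it can help to tally the malformed RPCs by type. The following is only a rough sketch: the function name summarize_malformed_rpcs is made up for illustration, and the log path in the usage comment is an assumption (check SlurmdLogFile in your slurm.conf for the real location).

```shell
# Hypothetical helper: tally "Malformed RPC" errors by RPC type.
# Reads log text on stdin and prints "<RPC type> <count>", one per line.
summarize_malformed_rpcs() {
    grep -o 'Malformed RPC of type [A-Z_]*([0-9]*)' \
        | sed 's/^Malformed RPC of type //' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Usage (the path is an assumption, not taken from this cluster):
# summarize_malformed_rpcs < /var/log/slurm/slurmd.log
```

Run against the excerpts above, this should list REQUEST_ABORT_JOB(6013) once and REQUEST_BATCH_JOB_LAUNCH(4005) and REQUEST_TERMINATE_JOB(6011) six times each, which matches the 6 submitted jobs.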
Lastly, I changed the config back to linear, restarted and reconfigured, and the node log says:
[2017-11-29T12:26:19.617] [6684.0] job_manager exiting with aborted job
[2017-11-29T12:26:19.621] [6684.0] done with job
[2017-11-29T12:26:24.591] Message aggregation disabled
[2017-11-29T12:26:24.592] Resource spec: Reserved system memory limit not configured for this node
Ethan VanMatre
Informatics Research Analyst
Institute on Development and Disability
Oregon Health & Science University
CSLU - GH40
3181 SW Sam Jackson Park Rd
Portland, OR 97239
(503) 346-3764
vanmatre at ohsu.edu