I can confirm that after updating to the recently released 24.05.2, the API endpoint

GET /slurm/v0.0.41/jobs

now works well.

cheers

josef

From: Daniel Letai via slurm-users <slurm-users@lists.schedmd.com>
Sent: Wednesday, 24 July 2024 19:29
To: slurm-users@lists.schedmd.com <slurm-users@lists.schedmd.com>
Subject: [slurm-users] Re: slumrestd 24.05.1: crashes when GET on /slurm/v0.0.41/nodes : unsorted double linked list corrupted
 

This is a known issue, resolved in 24.05.2 in the patches labeled "Always allocate pointers despite skipping parsing".

For example:

https://github.com/SchedMD/slurm/commit/5b07b6bda407431215606b93e57d0a9b7f4c9b53


The same patch also applies to 0.0.40 and 0.0.42
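To check quickly whether a given installation already carries the fix, a version comparison along these lines might help. This is only a sketch: the helper names are made up for illustration, and it assumes the version string contains a dotted triple such as "24.05.2" (for example, as pasted from a command's version output). The comparison is only meaningful within the 24.05 series discussed in this thread.

```python
import re

# Release named in this thread as containing the fix.
FIXED_IN = (24, 5, 2)


def parse_slurm_version(text):
    """Extract (major, minor, micro) from a string such as 'slurm 24.05.2'.

    Hypothetical helper: it simply takes the first dotted triple it finds.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(version_text):
    """Return True when the parsed version is at least 24.05.2."""
    return parse_slurm_version(version_text) >= FIXED_IN
```

For example, has_fix("24.05.1") is False while has_fix("slurm 24.05.2") is True.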



On 24/07/2024 15:53:13, Josef Dvořáček via slurm-users wrote:
Is this failure familiar to anyone?

When I query the API endpoint "localhost:6820/slurm/v0.0.41/jobs", slurmrestd segfaults with "unsorted double linked list corrupted".

Is anyone using this API endpoint without segfaulting?

I issue the GET using curl:

curl --header X-SLURM-USER-NAME:root --header X-SLURM-USER-TOKEN:$SLURM_JWT -G localhost:6820/slurm/v0.0.41/jobs


In comparison,

curl --header X-SLURM-USER-NAME:root --header X-SLURM-USER-TOKEN:$SLURM_JWT -G localhost:6820/slurm/v0.0.41/nodes

works well.
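For anyone who wants to reproduce the two requests above from a script rather than curl, here is a minimal Python sketch using only the standard library. The function names and the default base URL are assumptions for illustration; the header names and paths are the ones from the curl commands. Nothing is sent until fetch_json() is called.

```python
import json
import urllib.request


def build_slurm_request(path, user, token, base_url="http://localhost:6820"):
    """Build a GET request with the same two headers as the curl commands.

    base_url is an assumption: curl treats a bare 'localhost:6820' as plain HTTP.
    """
    request = urllib.request.Request(base_url + path, method="GET")
    request.add_header("X-SLURM-USER-NAME", user)
    request.add_header("X-SLURM-USER-TOKEN", token)
    return request


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body (raises on HTTP errors)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Typical use would be fetch_json(build_slurm_request("/slurm/v0.0.41/jobs", "root", os.environ["SLURM_JWT"])) after importing os. If slurmrestd crashes mid-request, urlopen will most likely raise a connection error rather than return a body.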


josef




čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug:  _on_url: [[localhost]:52909] url path: /slurm/v0.0.41/jobs query: (null)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: operations_router: [[localhost]:52909] GET /slurm/v0.0.41/jobs
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: slurmrestd: operations_router: [[localhost]:52909] GET /slurm/v0.0.41/jobs
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: slurmrestd: rest_auth/jwt: slurm_rest_auth_p_authenticate: [[localhost]:52909] attempting user_name root token authentication pass through
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: rest_auth/jwt: slurm_rest_auth_p_authenticate: [[localhost]:52909] attempting user_name root token authentication pass through
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: skip non-matching subdirectories: registered=1 requested=3
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["openapi.json"](0, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: skip non-matching subdirectories: registered=1 requested=3
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["openapi.yaml"](1, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: skip non-matching subdirectories: registered=1 requested=3
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["openapi"](2, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: skip non-matching subdirectories: registered=2 requested=3
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["openapi","v3"](3, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match slurm to slurm: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match v0.0.41 to v0.0.41: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match shares to jobs: FAILURE
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed shares
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["slurm","v0.0.41","shares"](4, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match slurm to slurm: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match v0.0.41 to v0.0.41: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match reconfigure to jobs: FAILURE
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed reconfigure
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["slurm","v0.0.41","reconfigure"](5, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match slurm to slurm: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match v0.0.41 to v0.0.41: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match diag to jobs: FAILURE
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed diag
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["slurm","v0.0.41","diag"](6, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match slurm to slurm: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match v0.0.41 to v0.0.41: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match ping to jobs: FAILURE
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed ping
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["slurm","v0.0.41","ping"](7, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match slurm to slurm: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match v0.0.41 to v0.0.41: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match licenses to jobs: FAILURE
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed licenses
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["slurm","v0.0.41","licenses"](8, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: method skip for ["slurm","v0.0.41","job","submit"](9, GET != POST) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["slurm","v0.0.41","job","submit"](9, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: method skip for ["slurm","v0.0.41","job","allocate"](10, GET != POST) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match failed for ["slurm","v0.0.41","job","allocate"](10, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match slurm to slurm: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match v0.0.41 to v0.0.41: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path: string attempt match jobs to jobs: SUCCESS
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _match_path_from_data: match successful for ["slurm","v0.0.41","jobs"](11, GET) to ["slurm","v0.0.41","jobs"](0x7F9C64001CB0)
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: operations_router: [[localhost]:52909] found callback handler: (0x0) callback_tag=0 path=/slurm/v0.0.41/jobs parser=data_parser/v0.0.41
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug4: _resolve_mime: [[localhost]:52909] did not provide a known content type header. Assuming URL encoded.
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug5: _parse_http_accept_entry: found */* with q=1.000000
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug4: _resolve_mime: [[localhost]:52909] accepts */* with q=1.000000
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug4: _resolve_mime: [[localhost]:52909] found accepts */*=application/json with q=1.000000
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug3: _resolve_mime: [[localhost]:52909] mime read: application/x-www-form-urlencoded write: application/json
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug3: _call_handler: [[localhost]:52909] BEGIN: calling ctxt handler: 0x7F9C9D294A36[0] for path: /slurm/v0.0.41/jobs
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug:  wrap_openapi_ctxt_callback: [[localhost]:52909] GET using data_parser/v0.0.41
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug4: xsignal: Swap signal PIPE[13] to 0x1 from 0x408376
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug4: xsignal: Swap signal PIPE[13] to 0x408376 from 0x1
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug:  accounting_storage/slurmdbd: _connect_dbd_conn: Sent PersistInit msg
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug4: xsignal: Swap signal PIPE[13] to 0x1 from 0x408376
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: debug4: xsignal: Swap signal PIPE[13] to 0x408376 from 0x1
čec 24 14:37:55 slurmserver2.koios.lan slurmrestd[1502900]: malloc(): unsorted double linked list corrupted
čec 24 14:37:55 slurmserver2.koios.lan systemd[1]: Started Process Core Dump (PID 1502951/UID 0).
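With debug logs this long, it can help to pull out which request was being handled when glibc aborted. The following is a rough sketch, not part of Slurm: the function name is hypothetical, and the regular expression is based only on the "_call_handler ... BEGIN" and "malloc():" lines visible in the log above, so it may need adjusting for other versions or log levels.

```python
import re

# Matches lines like:
#   debug3: _call_handler: [[localhost]:52909] BEGIN: calling ctxt handler: 0x...[0] for path: /slurm/v0.0.41/jobs
HANDLER_RE = re.compile(
    r"_call_handler: .* BEGIN: calling ctxt handler: \S+ for path: (\S+)"
)


def path_before_abort(lines):
    """Return the last handler path logged before a 'malloc():' abort line.

    Returns None if no abort line is found in the given lines.
    """
    last_path = None
    for line in lines:
        match = HANDLER_RE.search(line)
        if match:
            last_path = match.group(1)
        elif "malloc():" in line:
            return last_path
    return None
```

Fed the lines of the journal excerpt above, this should report /slurm/v0.0.41/jobs as the path in flight at the time of the abort.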



-- 
Regards,

Daniel Letai
+972 (0)505 870 456