[slurm-users] Slurmrestd error on slurmdb request

Philippe Noel philippe.noel at loria.fr
Fri Dec 3 13:50:20 UTC 2021


Hello,

I'm trying to send a request to my slurmrestd server to get all jobs:

$ curl localhost:6820/slurmdb/v0.0.36/jobs --header "X-SLURM-USER-NAME: slurm" --header "X-SLURM-USER-TOKEN: e...sM" -i
HTTP/1.1 200 OK
Content-Length: 430
Content-Type: application/json

{
    "meta": {
      "plugin": {
        "type": "openapi\/dbv0.0.36",
        "name": "REST DB v0.0.36"
      },
      "Slurm": {
        "version": {
          "major": 20,
          "micro": 8,
          "minor": 11
        },
        "release": "20.11.8"
      }
    },
    "errors": [
      {
        "error_number": 1007,
        "error": "Protocol authentication error",
        "source": "slurmdb_jobs_get"
      }
    ],
    "jobs": [
    ]
  }
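
Note that the request returns HTTP 200 even though the body carries an error, so a client has to look at the "errors" array rather than the status code. Below is a minimal Python sketch of that check, assuming the response layout shown above. The helper names build_request, slurm_errors and fetch_jobs are hypothetical, and the user name and token are placeholders to be supplied by the caller.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6820"  # address taken from the curl command above


def build_request(path, user, token):
    """Build a GET request carrying the two headers used in the curl call."""
    req = urllib.request.Request(BASE_URL + path, method="GET")
    req.add_header("X-SLURM-USER-NAME", user)
    req.add_header("X-SLURM-USER-TOKEN", token)
    return req


def slurm_errors(body):
    """Return the list of error entries found in a slurmrestd JSON body."""
    payload = json.loads(body)
    return payload.get("errors", [])


def fetch_jobs(user, token):
    """Fetch accounting jobs; raise if the body reports errors despite HTTP 200."""
    req = build_request("/slurmdb/v0.0.36/jobs", user, token)
    with urllib.request.urlopen(req) as resp:
        body = resp.read().decode("utf-8")
    errors = slurm_errors(body)
    if errors:
        raise RuntimeError("; ".join(
            f"{e.get('source')}: {e.get('error')} ({e.get('error_number')})"
            for e in errors))
    return json.loads(body).get("jobs", [])
```

With the response shown above, slurm_errors would return one entry (error_number 1007, source slurmdb_jobs_get), so fetch_jobs would raise instead of silently returning an empty job list.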

The service runs with the following unit file:

# /etc/systemd/system/slurmrestd.service
[Unit]
Description=Slurm REST daemon
After=network.target munge.service slurmctld.service
ConditionPathExists=/etc/slurm//slurm.conf
Documentation=man:slurmrestd(8)

[Service]
Type=simple
EnvironmentFile=-/etc/default/slurmrestd
# Default to local auth via socket
#ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS unix:/run/slurmrestd.socket -a rest_auth/local -vvv
# Uncomment to enable listening mode
Environment="SLURM_JWT=daemon"
ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS 127.0.0.1:6820 -a rest_auth/jwt -vvv
ExecReload=/bin/kill -HUP $MAINPID

[Install]
WantedBy=multi-user.target

In the logs, I see the following:

Nov 26 10:28:05 backend systemd[1]: Started Slurm REST daemon.
Nov 26 10:28:05 backend slurmrestd[17981]: debug2: _establish_config_source: using config_file=/etc/slurm/slurm.conf (default)
Nov 26 10:28:05 backend slurmrestd[17981]: debug:  slurm_conf_init: using config_file=/etc/slurm/slurm.conf
Nov 26 10:28:05 backend slurmrestd[17981]: debug:  Reading slurm.conf file: /etc/slurm/slurm.conf
Nov 26 10:28:05 backend slurmrestd[17981]: debug:  NodeNames=cluster setting Sockets=6 based on CPUs(6)/(CoresPerSocket(1)/ThreadsPerCore(1))
Nov 26 10:28:05 backend slurmrestd[17981]: debug:  Ignoring obsolete CacheGroups option.

Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug:  auth/jwt: init: JWT authentication plugin loaded
Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug:  parse_http: [[localhost]:44228] Accepted HTTP connection
Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug:  _on_url: [[localhost]:44228] url path: /slurmdb/v0.0.36/jobs query: (null)
Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value: [[localhost]:44228] Header: Host Value: localhost:6820
Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value: [[localhost]:44228] Header: User-Agent Value: curl/7.64.0
Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value: [[localhost]:44228] Header: Accept Value: */*
Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value: [[localhost]:44228] Header: X-SLURM-USER-NAME Value: slurm
Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value: [[localhost]:44228] Header: X-SLURM-USER-TOKEN Value: e...sM
Nov 26 10:29:15 backend-lola slurmrestd[17981]: operations_router: [[localhost]:44228] GET /slurmdb/v0.0.36/jobs
Nov 26 10:29:15 backend-lola slurmrestd[17981]: accounting_storage/slurmdbd: init: Accounting storage SLURMDBD plugin loaded
Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: slurm_persist_conn_open: Something happened with the receiving/processing of the persistent connection init message to localhost:6819: Failed to unpack SLURM_PERSIST_INIT message
Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: Sending PersistInit msg: No error
Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: g_slurm_auth_pack: protocol_version 6500 not supported
Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: slurm_send_node_msg: g_slurm_auth_pack: REQUEST_PERSIST_INIT has  authentication error: Operation now in progress
Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: slurm_persist_conn_open: failed to send persistent connection init message to localhost:6819
Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: Sending PersistInit msg: Protocol authentication error
Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: DBD_GET_JOBS_COND failure: Unspecified error
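
To make the failure easier to read, the error lines can be separated from the debug output. Below is a small Python sketch, assuming the journal format shown above (the message follows "slurmrestd[PID]: "). The function name error_lines is hypothetical.

```python
import re

# Matches "... slurmrestd[PID]: error: MESSAGE" as it appears in the log above.
ERROR_RE = re.compile(r"slurmrestd\[(\d+)\]: error: (.*)$")


def error_lines(log_text):
    """Return (pid, message) pairs for every slurmrestd error line."""
    found = []
    for line in log_text.splitlines():
        match = ERROR_RE.search(line)
        if match:
            found.append((int(match.group(1)), match.group(2)))
    return found
```

Applied to the log above, this keeps only the lines from the first slurm_persist_conn_open error through the DBD_GET_JOBS_COND failure.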

I don't understand what I am missing. Can you help me?

Resources like /slurm/v0.0.36/jobs work fine, but I need the full list
of jobs, which is only provided by /slurmdb/v0.0.36/jobs.

Philippe N.