[slurm-users] Slurmrestd error on slurmdb request
Schluenzen, Frank
frank.schluenzen at desy.de
Tue Mar 8 10:02:05 UTC 2022
Hi,
I just came across the same issue. It works if slurmdbd allows auth/jwt and has the proper key. In /etc/slurm/slurmdbd.conf:
# Authentication info
AuthType=auth/munge
AuthAltTypes=auth/jwt
AuthAltParameters=jwt_key=/etc/slurm/jwt_hs256.key
and /etc/slurm/jwt_hs256.key has to be identical on the dbd and the slurmrestd hosts.
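To illustrate why the key files have to match, here is a minimal sketch of HS256 signing and verification using only the Python standard library. It is not Slurm's own code: the key values, the helper names (b64url, sign_hs256, verify_hs256) and the claim set are assumptions made for the example. The point is only that a token signed with one key fails verification against any other key.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, key: bytes) -> str:
    """Build a compact JWT signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (
        b64url(json.dumps(header).encode("utf-8"))
        + "."
        + b64url(json.dumps(claims).encode("utf-8"))
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def verify_hs256(token: str, key: bytes) -> bool:
    """Recompute the signature with `key` and compare it to the token's."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), signature)


# Hypothetical key contents; real keys would be read from the jwt_hs256.key files.
issuing_key = b"contents-of-key-that-signed-the-token"
other_key = b"contents-of-a-different-key-file"

now = int(time.time())
# Assumed claim set for illustration only.
token = sign_hs256({"sun": "slurm", "iat": now, "exp": now + 3600}, issuing_key)

print(verify_hs256(token, issuing_key))  # same key: signature matches
print(verify_hs256(token, other_key))    # different key: signature does not match
```

If the daemon that validates the token holds a different key from the one that signed it, the second case applies and authentication is rejected.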
Cheers, Frank
> From: "Philippe Noel" <philippe.noel at loria.fr>
> To: slurm-users at lists.schedmd.com
> Sent: Friday, 3 December, 2021 14:50:20
> Subject: [slurm-users] Slurmrestd error on slurmdb request
> Hello,
> I'm trying to send a request to my slurmrestd server to get all jobs:
> $ curl localhost:6820/slurmdb/v0.0.36/jobs --header "X-SLURM-USER-NAME: slurm" \
>     --header "X-SLURM-USER-TOKEN: e...sM" -i
> HTTP/1.1 200 OK
> Content-Length: 430
> Content-Type: application/json
> {
> "meta": {
> "plugin": {
> "type": "openapi\/dbv0.0.36",
> "name": "REST DB v0.0.36"
> },
> "Slurm": {
> "version": {
> "major": 20,
> "micro": 8,
> "minor": 11
> },
> "release": "20.11.8"
> }
> },
> "errors": [
> {
> "error_number": 1007,
> "error": "Protocol authentication error",
> "source": "slurmdb_jobs_get"
> }
> ],
> "jobs": [
> ]
> }
> The service runs with
> # /etc/systemd/system/slurmrestd.service
> [Unit]
> Description=Slurm REST daemon
> After=network.target munge.service slurmctld.service
> ConditionPathExists=/etc/slurm//slurm.conf
> Documentation=man:slurmrestd(8)
> [Service]
> Type=simple
> EnvironmentFile=-/etc/default/slurmrestd
> # Default to local auth via socket
> #ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS unix:/run/slurmrestd.socket -a rest_auth/local -vvv
> # Uncomment to enable listening mode
> Environment="SLURM_JWT=daemon"
> ExecStart=/usr/sbin/slurmrestd $SLURMRESTD_OPTIONS 127.0.0.1:6820 -a rest_auth/jwt -vvv
> ExecReload=/bin/kill -HUP $MAINPID
> [Install]
> WantedBy=multi-user.target
> In logs, I have the following:
> Nov 26 10:28:05 backend systemd[1]: Started Slurm REST daemon.
> Nov 26 10:28:05 backend slurmrestd[17981]: debug2: _establish_config_source:
> using config_file=/etc/slurm/slurm.conf (default)
> Nov 26 10:28:05 backend slurmrestd[17981]: debug: slurm_conf_init: using
> config_file=/etc/slurm/slurm.conf
> Nov 26 10:28:05 backend slurmrestd[17981]: debug: Reading slurm.conf file:
> /etc/slurm/slurm.conf
> Nov 26 10:28:05 backend slurmrestd[17981]: debug: NodeNames=cluster setting
> Sockets=6 based on CPUs(6)/(CoresPerSocket(1)/ThreadsPerCore(1))
> Nov 26 10:28:05 backend slurmrestd[17981]: debug: Ignoring obsolete CacheGroups
> option.
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug: auth/jwt: init: JWT
> authentication plugin loaded
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug: parse_http:
> [[localhost]:44228] Accepted HTTP connection
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug: _on_url:
> [[localhost]:44228] url path: /slurmdb/v0.0.36/jobs query: (null)
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value:
> [[localhost]:44228] Header: Host Value: localhost:6820
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value:
> [[localhost]:44228] Header: User-Agent Value: curl/7.64.0
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value:
> [[localhost]:44228] Header: Accept Value: */*
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value:
> [[localhost]:44228] Header: X-SLURM-USER-NAME Value: slurm
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: debug2: _on_header_value:
> [[localhost]:44228] Header: X-SLURM-USER-TOKEN Value: e...sM
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: operations_router:
> [[localhost]:44228] GET /slurmdb/v0.0.36/jobs
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: accounting_storage/slurmdbd:
> init: Accounting storage SLURMDBD plugin loaded
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: slurm_persist_conn_open:
> Something happened with the receiving/processing of the persistent connection
> init message to localhost:6819: Failed to unpack SLURM_PERSIST_INIT message
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: Sending PersistInit msg:
> No error
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: g_slurm_auth_pack:
> protocol_version 6500 not supported
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: slurm_send_node_msg:
> g_slurm_auth_pack: REQUEST_PERSIST_INIT has authentication error: Operation
> now in progress
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: slurm_persist_conn_open:
> failed to send persistent connection init message to localhost:6819
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: Sending PersistInit msg:
> Protocol authentication error
> Nov 26 10:29:15 backend-lola slurmrestd[17981]: error: DBD_GET_JOBS_COND
> failure: Unspecified error
> I don't understand what I am missing. Can you help me?
> Resources like /slurm/v0.0.36/jobs work well, but I need the full list of jobs
> and it's only provided by /slurmdb/v0.0.36/jobs
> Philippe N.
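Note that the quoted response carries HTTP 200 even though the slurmdbd query failed, so a client cannot rely on the status code alone. A minimal sketch of checking the "errors" array in the body follows. The helper name slurm_errors is hypothetical, and the sample body is a trimmed copy of the response quoted above (the "meta" object is omitted).

```python
import json


def slurm_errors(body: str) -> list:
    """Return the entries of the top-level "errors" array, or [] if it is absent."""
    doc = json.loads(body)
    return doc.get("errors", [])


# Trimmed copy of the response body from the quoted message.
sample_body = """
{
  "errors": [
    {
      "error_number": 1007,
      "error": "Protocol authentication error",
      "source": "slurmdb_jobs_get"
    }
  ],
  "jobs": []
}
"""

for err in slurm_errors(sample_body):
    print(f'{err["source"]}: {err["error"]} ({err["error_number"]})')
```

An empty "jobs" list together with a non-empty "errors" list, as here, means the query did not run, not that there are no jobs.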