[slurm-users] DBD_GET_ASSOCS failure
E.M. Dragowsky
dragowsky at case.edu
Mon Feb 26 10:57:57 MST 2018
Greetings --
My experience this morning includes the following (where the error is
returned after about 10 min):
> [mrd20 at hpctest ~]$ sacctmgr show association where account=<userid>
> sacctmgr: error: slurmdbd: Getting response to message type 1410
> sacctmgr: error: slurmdbd: DBD_GET_ASSOCS failure: No error
> Error with request: No error
>
This seems to be a 'time-out' error, but I have no insight into why the
database would be unable to respond over the course of 10 minutes or so.
Otherwise, sacctmgr returns for "sacctmgr show account..." and other
types of queries. Only 'show association' seems affected.
Now, to the suspected cause:
Just prior to this, I had submitted an incorrect form of the
'sacctmgr delete user <usrid>' command, as follows, with the specific
userid omitted:
> [root at hpc2 mrd20]# sacctmgr delete user where account=txl80
> Deleting user associations...
> C = hpctest A = <accid> U = <usrid1>
> C = hpctest A = <accid> U = <usrid2>
> ...
> ...
> ...
> C = hpctest A = <accid> U = <usrid16>
> C = hpctest A = <accid> U = <usrid17>
> Deleting users (No Associations)...
> <usrid1>
> <usrid2>
> User <usrid3> on cluster hpctest no longer has a default account.
> ...
> ...
> ...
> <usrid10>
> <usrid11>
>
This action was terminated with Ctrl-C.
Two noteworthy items:
-- I had meant to operate on just one user association; and,
-- The command did not issue the expected prompt to verify that I wanted to
perform these deletions.
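For comparison, a minimal sketch of the single-user form that was intended,
assuming the 'name=' and 'account=' specifications shown in the sacctmgr man
page examples. The helper build_delete_cmd is hypothetical (it is not part of
Slurm), and usrid1 is a placeholder; the sketch only prints the command for
review and does not contact slurmdbd.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): print the sacctmgr command that
# removes one user's association with one account. It refuses empty
# arguments so that the user name cannot be silently omitted.
build_delete_cmd() {
    user="$1"
    account="$2"
    if [ -z "$user" ] || [ -z "$account" ]; then
        echo "usage: build_delete_cmd <usrid> <accid>" >&2
        return 1
    fi
    # name= restricts the deletion to a single user; in the session above,
    # leaving it out selected every user associated with the account.
    printf 'sacctmgr delete user name=%s account=%s\n' "$user" "$account"
}

# Print the command for review instead of executing it.
build_delete_cmd usrid1 txl80
```

Printing the command first gives a chance to confirm the scope before anything
is sent to the database.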
Since this occurred, the time-out with 'sacctmgr show assoc....' has
resolved, and it would seem that the associations were not impacted.
The most important questions to me remain: could the interrupted 'delete
user' have "hung" sacctmgr access to the slurmdb? And if so, what is going
on behind the scenes? I'd like to come away with better insight into how
the slurmdb operates. Pointers to slurmdb tutorials or slide decks are
welcome ;)
Best wishes, and thanks in advance
~ Em
--
E.M. Dragowsky, Ph.D.
Research Computing -- UTech
Case Western Reserve University
(216) 368-0082