<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
In addition, you can check why the nodes were set to drain with
`scontrol show node &lt;your node name&gt; | grep Reason`.<br>
The same information should also appear in the Slurm controller log
(e.g. /var/log/slurm/slurmctld.log).<br>
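<br>
If you want to check several nodes from a script, the same lookup can be
sketched in Python. This is only an illustration: the sample text below
imitates the general shape of `scontrol show node` output, and the node
name, reason text, and timestamp in it are made up. The helper name
drain_reason is also just a placeholder.<br>
<pre>
```python
# Sample text in the general shape of `scontrol show node` output.
# The node name, reason text, and timestamp are invented for illustration.
SAMPLE = """NodeName=pi01 Arch=armv7l CoresPerSocket=1
   State=IDLE+DRAIN ThreadsPerCore=1
   Reason=Low socket*core*thread count, Low CPUs [slurm@2019-04-15T17:02:11]
"""


def drain_reason(scontrol_output):
    """Return the text after 'Reason=', or None if no reason line is present."""
    for line in scontrol_output.splitlines():
        line = line.strip()
        if line.startswith("Reason="):
            return line[len("Reason="):]
    return None


print(drain_reason(SAMPLE))
```
</pre>
On a real cluster you would pass in the output of the command itself, for
example subprocess.run(["scontrol", "show", "node", name],
capture_output=True, text=True).stdout. `sinfo -R` also lists the reasons
for all down or drained nodes in one go.<br>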
<br>
Colas<br>
<br>
<div class="moz-cite-prefix">On 2019-04-15 18:03, Andy Riebs wrote:<br>
</div>
<blockquote type="cite"
cite="mid:2414ec73-b556-3f9c-d3bc-13c7f48b10f6@hpe.com">
The "invalid user id" message suggests that you need to be running
as root (or possibly as the slurm user?) to update the node state.<br>
<br>
Run "slurmd -Dvv" as root on one of the compute nodes and it will
show you what it thinks is the socket/core/thread configuration.<br>
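<br>
As a rough sketch of the comparison this enables: `slurmd -C` prints the
detected hardware as a NodeName=... line that can be set against the node
definition in slurm.conf. The two lines below are hypothetical values, not
taken from the cluster in question, and treating a missing field as 1 is a
simplification made for this sketch.<br>
<pre>
```python
def parse_node_line(line):
    """Split a 'Key=Value Key=Value ...' node line into a dict of strings."""
    fields = {}
    for token in line.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


def cpu_product(fields):
    """Sockets * CoresPerSocket * ThreadsPerCore; missing fields count as 1
    (an assumption made for this sketch)."""
    sockets = int(fields.get("Sockets", fields.get("SocketsPerBoard", 1)))
    cores = int(fields.get("CoresPerSocket", 1))
    threads = int(fields.get("ThreadsPerCore", 1))
    return sockets * cores * threads


# Hypothetical values: what slurm.conf claims versus what the node reports.
configured = parse_node_line(
    "NodeName=pi01 Sockets=1 CoresPerSocket=4 ThreadsPerCore=2"
)
detected = parse_node_line(
    "NodeName=pi01 CPUs=4 Boards=1 SocketsPerBoard=1 "
    "CoresPerSocket=4 ThreadsPerCore=1 RealMemory=926"
)

print(cpu_product(configured), cpu_product(detected))
```
</pre>
If the configured product comes out larger than the detected one, that
would be consistent with a "low socket*core*thread count" drain reason.<br>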
<br>
<div id="smartTemplate4-quoteHeader">
<hr> <b>From:</b> Shihanjian Wang <a
class="moz-txt-link-rfc2396E" href="mailto:swang52@ucsc.edu"
moz-do-not-send="true">&lt;swang52@ucsc.edu&gt;</a> <br>
<b>Sent:</b> Monday, April 15, 2019 5:30PM <br>
<b>To:</b> Slurm-users <a class="moz-txt-link-rfc2396E"
href="mailto:slurm-users@lists.schedmd.com"
moz-do-not-send="true">&lt;slurm-users@lists.schedmd.com&gt;</a><br>
<b>Cc:</b> <br>
<b>Subject:</b> [slurm-users] Scontrol update: invalid user id <br>
</div>
<div class="replaced-blockquote"
cite="mid:CAO3y_0rtHy8L76rMMeB3Mf959k44SbRifxCharYMtr7Or5-Ccg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>Hi,</div>
<div>We are doing a senior project involving the creation of a
Pi Cluster. We are using 7 Raspberry Pi B+'s in this
cluster. <br>
<br>
When we use sinfo to look at the status of the nodes, they
appear as drained. We also ran into a problem when trying to
update the state of the nodes: scontrol fails with the error
message "scontrol update: invalid user id". We think another
reason the nodes are drained is low "resources", that is, a
low socket*core*thread count, which is the number of CPUs.
We have tried changing this number in the configuration
file, but the reason still shows.<br>
<br>
</div>
<div>We are unsure what is causing this problem. The
authentication method is munge, and we believe Slurm is in
fact using munge as the authentication type.</div>
<div><br>
</div>
<div>If more information is needed, please let us know and we
will provide the required information.<br>
<br>
</div>
<div>Thanks.</div>
</div>
</div>
<br>
</blockquote>
<br>
</body>
</html>