[slurm-users] slurm_update error: Invalid node state specified

Paul H. Hargrove phhargrove at lbl.gov
Tue Oct 11 16:22:57 UTC 2022


I think Rob is "on the right track" here.  Specifically, I don't think the
error message means that "RESUME" is unrecognized as the name of a state.
Rather the message means that a state transition from "INVAL" to "RESUME"
is invalid.  I can reproduce that message by trying to "RESUME" an "IDLE"
node, but "RESUME" works fine for a node that has recently been rebooted.
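
To chase the discrepancy Rob describes below, one option is to compare the
line printed by "slurmd -C" on the node against the matching NodeName line in
slurm.conf. Here is a minimal sketch of that comparison, not a definitive
tool. The helper names are hypothetical, and the sample values (CPUs,
RealMemory, and so on) are invented for illustration; they are not taken from
Sushil's machine. It only flags keys that appear in both lines with different
values, which may be stricter than the rule slurmctld actually applies.

```python
import shlex


def parse_node_line(line):
    """Parse a 'Key=Value Key=Value ...' line into a dict with lowercase keys."""
    fields = {}
    for token in shlex.split(line):
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key.lower()] = value
    return fields


def find_mismatches(reported_line, configured_line):
    """Return {key: (reported, configured)} for keys whose values differ.

    Only keys present in both lines are compared, so settings that
    'slurmd -C' does not report (State, Weight, ...) are skipped.
    """
    reported = parse_node_line(reported_line)
    configured = parse_node_line(configured_line)
    mismatches = {}
    for key, conf_value in configured.items():
        if key == "nodename" or key not in reported:
            continue
        if reported[key] != conf_value:
            mismatches[key] = (reported[key], conf_value)
    return mismatches


# Hypothetical sample data: paste the first line of `slurmd -C` and the
# NodeName line from your own slurm.conf here instead.
reported = ("NodeName=fucose CPUs=8 Boards=1 SocketsPerBoard=1 "
            "CoresPerSocket=4 ThreadsPerCore=2 RealMemory=15885")
configured = "NodeName=fucose CPUs=16 RealMemory=15885 State=UNKNOWN"

print(find_mismatches(reported, configured))
```

If something turns up, then going by Rob's description the node will
presumably stay "inval" until slurm.conf agrees with the hardware, and only
after that is "RESUME" likely to be accepted.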

-Paul


On Tue, Oct 11, 2022 at 8:14 AM Groner, Rob <rug262 at psu.edu> wrote:

> Have you checked the logs for slurmd and slurmctld?  I seem to recall that
> the "invalid" state for a node meant that there was some discrepancy
> between what the node says or thinks it has (slurmd -C) and what the
> slurm.conf says it has.  While there is that discrepancy and the node is
> invalid, you can't just tell it to resume.
>
> ------------------------------
> *From:* slurm-users <slurm-users-bounces at lists.schedmd.com> on behalf of
> Sushil Mishra <sushilbioinfo at gmail.com>
> *Sent:* Tuesday, October 11, 2022 10:08 AM
> *To:* Slurm User Community List <slurm-users at lists.schedmd.com>
> *Subject:* [slurm-users] slurm_update error: Invalid node state specified
>
> Dear all,
>
> I am stuck with scontrol not recognizing the state keywords. I wonder if
> someone can point me to the possible cause of the error.  I
> restarted slurmd a few times, and it didn't help.
>
> [sushil at fucose ~]$ sinfo
> PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
> LocalQ*      up   infinite      1  inval fucose
>
> [sushil at fucose ~]$ sinfo -R
> REASON               USER      TIMESTAMP           NODELIST
> cg                   sushil    2022-10-10T18:11:27 fucose
>
> [sushil at fucose ~]$ sudo scontrol update NodeName=fucose state=RESUME
> [sudo] password for sushil:
> slurm_update error: Invalid node state specified
>
> [sushil at fucose ~]$ squeue
>              JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
>
> Best,
> Sushil
>
>


-- 
Paul H. Hargrove <PHHargrove at lbl.gov>
Pronouns: he, him, his
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department
Lawrence Berkeley National Laboratory