[slurm-users] Reservation with REPLACE_DOWN flag not replacing down nodes

Prentice Bisbal pbisbal at pppl.gov
Thu Jun 1 19:46:30 UTC 2023


I have this reservation which has the REPLACE_DOWN flag set:

ReservationName=test StartTime=2020-12-14T09:00:00 EndTime=2023-12-14T09:00:00 Duration=1095-00:00:00
    Nodes=traverse-k05g4 NodeCnt=1 CoreCnt=32 Features=(null) PartitionName=all Flags=IGNORE_JOBS,WEEKDAY,SPEC_NODES,REPLACE_DOWN,NO_HOLD_JOBS_AFTER_END
    TRES=cpu=128
    Users=(null) Groups=(null) Accounts=pppl,csi,pu,tromp,cses Licenses=(null) State=ACTIVE BurstBuffer=(null) Watts=n/a
    MaxStartDelay=(null)

Unfortunately, the one node in that reservation is down, and the 
reservation isn't being moved to another node:

# sinfo -n traverse-k05g4
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
all*         up 15-00:00:0      1  down* traverse-k05g4

I thought that removing the node from the reservation would cause the 
reservation to be assigned a different node, or that removing the 
SPEC_NODES flag would accomplish the same thing, but scontrol rejected 
both attempts:

# scontrol update reservationname=test nodes-=traverse-k05g4
scontrol: error: Reservation can't be updated with Nodes option; it is incompatible with REPLACE[_DOWN]
Error updating the reservation: Requested operation not supported on this system
slurm_update error: Requested operation not supported on this system

# scontrol update reservationname=test flags-=spec_nodes
scontrol: error: Error parsing flags -spec_nodes.  No reservation update.
slurm_update error: No error
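
For anyone reproducing this, the flag combination that the first error complains about can be confirmed from the reservation record itself. The following is only a minimal sketch: has_flag is a hypothetical helper (not part of Slurm), and it assumes the record uses the "Flags=A,B,C" layout shown in the output above.

```shell
# Hypothetical helper (not part of Slurm): succeeds if the reservation
# record read from stdin lists the given flag in its Flags= field.
has_flag() {
    grep -o 'Flags=[^ ]*' | tr ',=' '\n\n' | grep -qxF "$1"
}

# Sample line copied from the reservation shown earlier. On a live system
# the record would instead come from: scontrol show reservation test
record='Nodes=traverse-k05g4 NodeCnt=1 CoreCnt=32 Features=(null) PartitionName=all Flags=IGNORE_JOBS,WEEKDAY,SPEC_NODES,REPLACE_DOWN,NO_HOLD_JOBS_AFTER_END'

# Report when both flags named in the scontrol errors are present.
if printf '%s\n' "$record" | has_flag SPEC_NODES &&
   printf '%s\n' "$record" | has_flag REPLACE_DOWN; then
    echo "SPEC_NODES and REPLACE_DOWN are both set"
fi
```

The match is on whole flag names, so asking for REPLACE does not match REPLACE_DOWN.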

Any ideas about what I'm doing wrong here, or what I can do to get this 
reservation assigned to nodes that are up? I'm trying to avoid deleting 
the entire reservation and creating a new one, if possible.

-- 
Prentice