[slurm-users] Slurm reservation for migrating user home directories

Ole Holm Nielsen Ole.H.Nielsen at fysik.dtu.dk
Tue Apr 27 06:58:56 UTC 2021


On 4/16/21 4:21 PM, Ole Holm Nielsen wrote:
> I'm thinking of a reservation something like this:
> 
> scontrol create reservation starttime=...  duration=12:00:00 
> ReservationName=migrate_physics nodes=ALL Accounts=-physics

For the record:  Creating a Slurm reservation that excludes specified 
accounts from running jobs appears to be viable.  The question is being 
tracked in https://bugs.schedmd.com/show_bug.cgi?id=11404

The correct way to make such a reservation is actually to add several flags:

$ scontrol create reservation reservationname=exclude_account \
    starttime=13:40:00 duration=30:00 flags=ignore_jobs,magnetic,flex \
    nodes=ALL accounts=-sub1
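
A minimal sketch of how the command above could be parameterized, in case 
several accounts have to be migrated one after another.  The function name 
exclude_account_cmd and its three arguments are my own illustration, not 
part of Slurm; the function only prints the scontrol command so that it can 
be reviewed before anything is run.

```shell
#!/bin/sh
# Hypothetical helper (not a Slurm tool): print, but do not run, a
# reservation command that excludes one account on all nodes.
exclude_account_cmd() {
    account="$1"    # Slurm account to exclude, e.g. sub1
    start="$2"      # reservation start time, e.g. 13:40:00
    duration="$3"   # reservation duration, e.g. 30:00
    printf 'scontrol create reservation reservationname=exclude_%s starttime=%s duration=%s flags=ignore_jobs,magnetic,flex nodes=ALL accounts=-%s\n' \
        "$account" "$start" "$duration" "$account"
}

# Print the command for account sub1 for inspection.
exclude_account_cmd sub1 13:40:00 30:00
```

Once the printed command looks right it can be pasted into a shell, checked 
with "scontrol show reservation exclude_sub1", and removed again after the 
migration with "scontrol delete reservationname=exclude_sub1".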

Caveat: This causes all Pending jobs to show an incorrect 
Reason=(ReqNodeNotAvail, Reserved for maintenance).  Jobs from other 
accounts nevertheless seem to start correctly, so this does achieve the 
goal, but it will probably also confuse users!
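
To see how many pending jobs carry this Reason, something along the 
following lines may help.  This is only a sketch: the function name 
summarize_reserved is made up for illustration, and I am assuming that the 
squeue reason field (%r) contains the string ReqNodeNotAvail for the 
affected jobs, so the match is deliberately loose.

```shell
#!/bin/sh
# Hypothetical helper: read "jobid|account|reason" lines on stdin and
# print, per account, the number of jobs whose reason mentions
# ReqNodeNotAvail (assumed marker of the misleading Reason).
summarize_reserved() {
    awk -F'|' '$3 ~ /ReqNodeNotAvail/ { n[$2]++ }
               END { for (a in n) print a, n[a] }' | sort
}

# Only query the scheduler when squeue is actually installed.
if command -v squeue >/dev/null 2>&1; then
    # -h: no header, -t PD: pending jobs, %i/%a/%r: job id, account, reason
    squeue -h -t PD -o '%i|%a|%r' | summarize_reserved
fi
```

Pending jobs from accounts other than the excluded one that show up in this 
list are the ones whose owners are likely to be confused by the Reason.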

SchedMD is looking at a way to enhance a future Slurm version so that the 
incorrect Reason doesn't appear.


>> On 16/04/2021 14.23, Ole Holm Nielsen wrote:
>>> I need to migrate several sets of user home directories from an old NFS 
>>> file server to a new NFS file server.  Each group of users belongs to 
>>> specific Slurm accounts organized in a hierarchical tree.
>>>
>>> I want to make the migration while the cluster is in full production 
>>> mode for all the other accounts (the terms "service window" or 
>>> "downtime" don't exist for me :-)
>>>
>>> My idea is to make a Slurm reservation so that the accounts in question 
>>> will have zero jobs running during the reservation, and I also need to 
>>> kick users off the login nodes.  During the reservation I can rsync the 
>>> home directories from the old NFS server to the new NFS server and 
>>> update the NFS automounter links.
>>>
>>> Question:  Does anyone have experience with this type of scenario? Any 
>>> good ideas or suggestions for other methods for data migration?

/Ole


