[slurm-users] Slurm reservation for migrating user home directories
Ole Holm Nielsen
Ole.H.Nielsen at fysik.dtu.dk
Tue Apr 27 06:58:56 UTC 2021
On 4/16/21 4:21 PM, Ole Holm Nielsen wrote:
> I'm thinking of a reservation something like this:
>
> scontrol create reservation starttime=... duration=12:00:00
> ReservationName=migrate_physics nodes=ALL Accounts=-physics
For the record: The idea of creating a Slurm reservation that excludes
specified accounts from running jobs seems to be a viable one. The
question is being tracked in https://bugs.schedmd.com/show_bug.cgi?id=11404

The correct way to make such a reservation is actually to add several flags:
$ scontrol create reservation reservationname=exclude_account \
    starttime=13:40:00 duration=30:00 flags=ignore_jobs,magnetic,flex \
    nodes=ALL accounts=-sub1
Caveat: This will cause all Pending jobs to show an incorrect
Reason=(ReqNodeNotAvail, Reserved for maintenance). It seems that jobs
from other accounts still start correctly, however, so the reservation
does achieve the goal, but it will probably also confuse users!

SchedMD is looking at a way to enhance a future Slurm version so that the
incorrect Reason doesn't appear.
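
To see which jobs outside the excluded account are displaying the misleading
Reason while the reservation is active, something like the following might
help. This is only a rough sketch: the function name misleading_reason_jobs
and the "jobid|account|reason" input layout are my own assumptions, and the
filter compares the account name exactly, so jobs in child accounts of the
excluded account are not recognized as excluded.

```shell
# Hypothetical helper (not part of Slurm): reads "jobid|account|reason"
# lines on stdin and prints the job ID and account of every job that is
# NOT in the excluded account but still reports ReqNodeNotAvail.
misleading_reason_jobs() {
    excluded="$1"
    awk -F'|' -v acct="$excluded" \
        '$2 != acct && $3 ~ /ReqNodeNotAvail/ { print $1, $2 }'
}

# On a live cluster (requires Slurm), the input could come from squeue:
#   squeue -h -t PD -o "%i|%a|%r" | misleading_reason_jobs sub1
```

The output could be used to reassure affected users that their jobs are
expected to start despite the Reason shown.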
>> On 16/04/2021 14.23, Ole Holm Nielsen wrote:
>>> I need to migrate several sets of user home directories from an old NFS
>>> file server to a new NFS file server. Each group of users belong to
>>> specific Slurm accounts organized in a hierarchical tree.
>>>
>>> I want to make the migration while the cluster is in full production
>>> mode for all the other accounts (the terms "service window" or
>>> "downtime" don't exist for me :-)
>>>
>>> My idea is to make a Slurm reservation so that the accounts in question
>>> will have zero jobs running during the reservation, and I also need to
>>> kick users off the login nodes. During the reservation I can rsync the
>>> home directories from the old NFS server to the new NFS server and
>>> update the NFS automounter links.
>>>
>>> Question: Does anyone have experience with this type of scenario? Any
>>> good ideas or suggestions for other methods for data migration?
/Ole