<div dir="ltr"><p class="gmail-MsoPlainText" style="margin:0cm;font-size:11pt;font-family:Calibri,sans-serif">Thank you, I will try this and get back to you. Is there any other step
I am missing for the migration?</p><p class="gmail-MsoPlainText" style="margin:0cm;font-size:11pt;font-family:Calibri,sans-serif"><br></p><p class="gmail-MsoPlainText" style="margin:0cm;font-size:11pt;font-family:Calibri,sans-serif">Thank you,</p><p class="gmail-MsoPlainText" style="margin:0cm;font-size:11pt;font-family:Calibri,sans-serif"><br></p><p class="gmail-MsoPlainText" style="margin:0cm;font-size:11pt;font-family:Calibri,sans-serif">Purvesh </p></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 24 Apr 2023 at 12:08, Ole Holm Nielsen <<a href="mailto:Ole.H.Nielsen@fysik.dtu.dk">Ole.H.Nielsen@fysik.dtu.dk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 4/24/23 08:09, Purvesh Parmar wrote:<br>
> Thank you. However, because this is a change of data center and the server <br>
> hostnames and FQDNs contain the data center name, I have to change both the <br>
> hostnames and the IP addresses to the names assigned for the new DC.<br>
<br>
Could your data center be persuaded to introduce DNS CNAME aliases for the <br>
old names to point to the new DC names?<br>
<br>
If you're forced to use new DNS names only, then it's simple to change DNS <br>
names of compute nodes and partitions in slurm.conf:<br>
<br>
NodeName=...<br>
PartitionName=xxx Nodes=...<br>
<br>
as well as the slurmdbd server name:<br>
<br>
AccountingStorageHost=...<br>
<br>
What I have never tried before is to change the DNS name of the slurmctld <br>
host:<br>
<br>
ControlMachine=...<br>
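<br>
A minimal sketch of how the renames above could be applied to a copy of <br>
slurm.conf in one pass, assuming plain comma-separated host names. The <br>
function name rename_hosts and the dc1-/dc2- host names in the usage note are <br>
hypothetical. SlurmctldHost is included because it is the newer spelling of <br>
ControlMachine. Hostlist ranges such as node[01-08] are left untouched and <br>
would need editing by hand.<br>
<br>
<pre>
```python
import re

# Keys whose values name hosts: the slurm.conf lines quoted above, plus
# SlurmctldHost, the newer spelling of ControlMachine.
HOST_KEYS = ("NodeName", "Nodes", "AccountingStorageHost",
             "ControlMachine", "SlurmctldHost")


def rename_hosts(conf_text, mapping):
    """Return conf_text with old host names replaced by new ones.

    Only plain comma-separated names are handled. Hostlist ranges such as
    node[01-08] are not expanded and are returned unchanged.
    """
    key_re = re.compile(r"\b(%s)=(\S+)" % "|".join(HOST_KEYS))

    def fix_value(match):
        key, value = match.group(1), match.group(2)
        names = [mapping.get(name, name) for name in value.split(",")]
        return "%s=%s" % (key, ",".join(names))

    out = []
    for line in conf_text.splitlines():
        if line.lstrip().startswith("#"):
            out.append(line)  # leave comment lines alone
        else:
            out.append(key_re.sub(fix_value, line))
    return "\n".join(out)
```
</pre>
For example, rename_hosts(text, {"dc1-ctl": "dc2-ctl"}) rewrites <br>
ControlMachine=dc1-ctl to ControlMachine=dc2-ctl. Compare the result with the <br>
original file before installing it.<br>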
<br>
The critical aspect here is that you need to stop all batch jobs, plus <br>
slurmdbd and slurmctld. Then you can back up (as a tarball) and transfer the <br>
Slurm state directories:<br>
<br>
StateSaveLocation=/var/spool/slurmctld<br>
<br>
However, I don't know whether the name of the ControlMachine is hard-coded <br>
in the StateSaveLocation files.<br>
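<br>
One rough way to probe this during a test migration is to search a copy of <br>
the state directory for the old host name. The sketch below assumes that a <br>
stored name would appear as plain bytes, which is not confirmed; the function <br>
name files_containing is hypothetical. An empty result therefore does not <br>
prove that the name is absent.<br>
<br>
<pre>
```python
import os


def files_containing(state_dir, old_name):
    """Return sorted paths under state_dir whose bytes contain old_name."""
    needle = old_name.encode("utf-8")
    hits = []
    for root, _dirs, files in os.walk(state_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        hits.append(path)
            except OSError:
                continue  # unreadable file: skip it
    return sorted(hits)
```
</pre>
Run it against the extracted tarball rather than the live directory, for <br>
example files_containing("/tmp/slurmctld-copy", "old-ctl-name"), where both <br>
arguments are placeholders.<br>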
<br>
I strongly suggest that you try to make a test migration of the cluster to <br>
the new DC to find out if it works or not. Then you can always make <br>
multiple attempts without breaking anything.<br>
<br>
Best regards,<br>
Ole<br>
<br>
<br>
> On Mon, 24 Apr 2023 at 11:25, Ole Holm Nielsen <<a href="mailto:Ole.H.Nielsen@fysik.dtu.dk" target="_blank">Ole.H.Nielsen@fysik.dtu.dk</a>> wrote:<br>
> <br>
> On 4/24/23 06:58, Purvesh Parmar wrote:<br>
> > Thank you, but the hostnames are changing as well as the IP addresses:<br>
> > the Slurm server, the database server, and the slurmd compute nodes<br>
> > all get new names.<br>
> <br>
> I suggest that you talk to your networking people and request that the<br>
> old DNS names be created in the new network's DNS for your Slurm cluster.<br>
> Then Ryan's solution will work. Changing DNS names is a very simple<br>
> matter!<br>
> <br>
> My 2 cents,<br>
> Ole<br>
> <br>
> <br>
> > On Mon, 24 Apr 2023 at 10:04, Ryan Novosielski <<a href="mailto:novosirj@rutgers.edu" target="_blank">novosirj@rutgers.edu</a>> wrote:<br>
> ><br>
> > I think it’s easier than all of this. Are you actually changing the names<br>
> > of all of these things, or just the IP addresses? If they all resolve to<br>
> > an IP now and you can bring everything down and change the hosts files<br>
> > or DNS, it seems to me that if the names aren’t changing, that’s that.<br>
> > I know that “sacctmgr show cluster” will show the wrong IP address, but<br>
> > I think that updates itself.<br>
> ><br>
> > The names of the servers are in slurm.conf, but again, if the names<br>
> > don’t change, that won’t matter. If you have IPs there, you will need<br>
> > to change them.<br>
> ><br>
> > Sent from my iPhone<br>
> ><br>
> > > On Apr 23, 2023, at 14:01, Purvesh Parmar <<a href="mailto:purveshp0507@gmail.com" target="_blank">purveshp0507@gmail.com</a>> wrote:<br>
> > > <br>
> > > Hello,<br>
> > ><br>
> > > We have Slurm 21.08 on Ubuntu 20. We have a cluster of 8 nodes.<br>
> > > All Slurm communication happens over the 192.168.5.x network (LAN).<br>
> > > As per a new requirement, we are now migrating the cluster to other<br>
> > > premises, where the LAN is 172.16.1.x. I have to migrate the entire<br>
> > > setup, including SLURMDBD (mariadb), SLURMCTLD, and SLURMD. The cluster<br>
> > > network changes from 192.168.5.x to 172.16.1.x, and each node will be<br>
> > > assigned an IP address from the 172.16.1.x network.<br>
> > > The cluster has been running for the last 3 months, and the old usage<br>
> > > stats must be retained.<br>
> > ><br>
> > ><br>
> > > Is the procedure below correct?<br>
> > ><br>
> > > 1) Stop slurm<br>
> > > 2) suspend all the queued jobs<br>
> > > 3) backup slurm database<br>
> > > 4) change the Slurm and munge configuration, i.e. munge conf, mariadb<br>
> > > conf, slurmdbd.conf, slurmctld.conf, slurmd.conf (on compute nodes),<br>
> > > gres.conf, service file<br>
> > > 5) later, update the Slurm database by executing the command below<br>
> > > sacctmgr modify node where node=old_name set name=new_name<br>
> > > for all the nodes.<br>
> > > Also, I think the Slurm server name and the slurmdbd server name<br>
> > > need to be updated. I am still checking how to do that.<br>
> > > 6) Finally, start slurmdbd and slurmctld on the server and slurmd on the<br>
> > > compute nodes<br>
> > ><br>
> > > Please help and guide for above.<br>
> > ><br>
> > > Regards,<br>
> > ><br>
> > > Purvesh Parmar<br>
> > > INHAIT<br>
> <br>
<br>
<br>
</blockquote></div>