DeBaan: March 2010

Generic steps to P2V/ V2V/ V2P/ P2P/ clone Linux with dissimilar hardware and/or hypervisors

To handle cases where a simple right-click and clone or copy of vhd/vmdk won't suffice. From my blog at http://debaan.blogspot.com/ -Lane Bryson

Background

Purpose

This process will copy a running system (physical or virtual) to a running environment (physical or virtual). It is bandwidth efficient. It is not easy, but it works reliably with some caveats: new platforms, new rescue CDs, changes in the standard storage drivers, and so forth, may require additional work. A strong mid- or senior-level Linux administrator should be able to complete these steps.

Benefits

Depending on the applications/services on the source system, you don’t have to bring them down for this cloning operation.

Because this uses rsync rather than a disk-imaging tool, at best you can run this process with zero downtime; at worst (once you’re good at it), the downtime is only slightly longer than the time it takes to make a final second or third rsync pass, which depends on how much data changes and how quickly.

Because you configure partitioning yourself, you can take the opportunity to adjust sizes as desired. With either an imaging tool or this process, you’ll have to rebuild your initrd from a rescue disk if the source and target storage drivers, device names or paths (including logical volumes) have changed.

This process requires no additional tools such as ghost or Acronis.

Caveat Emptor

This process is derived from my own notes and experience. It works for me every time – probably close to 30-40 times so far. However, an understanding of the process is needed, as you’ll have to fill in the blanks and apply the process to your specific environment (specific distro, storage config, network, virtualization platform and environment, etc.). Test it out thoroughly on non-production systems first, until you are comfortable with it.

Things are constantly changing; there are many different distributions, hypervisors and countless hardware combinations. I will not update this for every case. Use this as a starting point and adjust as appropriate for your situation.

Use at your own risk! You may destroy things, and I will not stop you.

The trick here is that we rsync a running system. Provided the system is fairly quiet or has static data (like a print server), we won’t likely lose any data. Any data that changes on the source system from the time of the last rsync until the destination system is brought online (and the source system taken offline) will be lost – unless you stop the service or application on the source before the last rsync. If this is, e.g., a db server we’ll only have an outage while/if we stop the database service for the final rsync. If this is a static DNS or print server, and we don’t care about losing data in the logs, then we probably don’t need to stop anything, and can make the transition without anybody knowing it’s happened.

This would not be a suitable process for copying, e.g., a system with terabytes of dynamic data on local storage. The rsync passes alone would take so long that you’d probably be better off quiescing the system and taking a scheduled outage.

If those caveats are acceptable to you, this process will be very reliable, if very manual. It could be scripted, but usually my source systems are varied quite a bit, and I take this opportunity to do things that require human intelligence like:

Re-evaluate RAM and filesystem size, cpu core count

convert from static partitions in the source to LVM-based filesystems in the destination

Source Platform Prep

If the target is a different platform type than the source, and if you can take the possible performance hit (temporarily) or a required reboot, then uninstall any specialized hardware device drivers that are not absolutely required (e.g. raid monitoring, ATI catalyst drivers, VMware Tools, XenServer tools, etc). DO NOT uninstall essential network or storage drivers. While some of this can be done from within the chroot on the target, it is probably easier to do on the source.

Prepare the Target Platform

Hardware/Physical Target Platform

Configure the RAID, physical disks and BIOS boot order as desired. Details are beyond the scope of this document.

Xen Paravirtualized Target: a special case

Why so much work here? Because Linux paravirtualized domains run with extremely low overhead on XenServer. But our utility/rescue CD requires a fully virtualized (“HVM”) vm in which to run. And there’s not (to my knowledge) a simple way to switch a vm from “HVM” to paravirt.

If the target will be a paravirtualized Xen vm, then you’ll need to clone a pre-existing post-install one or manually set some flags; it's easier just to install CentOS as paravirtualized to get the correct settings, then just remove the disks and connect new ones in the next steps.

Create a HVM domain (from the "other OS" template), and attach the storage from the post-install paravirtualized domain. This HVM domain will be used to set up the disk and copy files from the source machine/vm.

Specifically:

Install a like-versioned (same distro and version as on the source platform) vm from the template. If you plan to re-use it because you’re doing a lot of P2V or V2V’s, then call it something like “Paravirt Post-Install” and clone it. In the end of this whole process, this is the vm that will ultimately replace the source system.
Remove the storage on the "Paravirt Post-install" VM and customize the VM to what you need, probably the same as in the source. (Set the CPU count and architecture, RAM, priority, startup parameters, etc; you may set the MAC to be the same as your source Linux system, just don't power it on at the same time as the source!)
Create a HVM domain (from the "other OS" template), and create disk devices –these are the disk devices that will end up in the final target VM. This HVM domain will be used temporarily to set up the disk and copy files from the source machine/vm.

Now you have a paravirtualized domain ready to receive the finalized disks, and a HVM domain where you will do all the heavy lifting and prepare the disks.

VMware ESX VM Target Prep

Create a VM according to the specs you need: cpu count and architecture, RAM, startup settings, disk devices, etc.

Recommend e1000 network adapter and LSI Logic parallel SCSI adapter.

Physical System Destination Prep

Configure the physical box as you require. The closer to the source configuration, the better. CPU architecture (32-bit or 64 bit) must be the same.

Target System Bringup

(Note that all of these commands/operations will be performed in the target vm/physical platform.)

Note for legacy source OS

If migrating a legacy (e.g., RHEL/CentOS 4.x or 3.x) Linux guest, where sata devices might be recognized as hda, hdb, etc., then boot with the systemrescuecd 0.4.3 (may have to use the alternative kernel, vmlinuz2), to ensure that the devices show up after boot with the correct (hdaX, not sdaX) device names.

Boot the LiveCD

Boot the Target platform from the LiveCD. I prefer to use the System RescueCD (downloadable from http://www.sysresccd.org/ ).

If source is running a 64-bit kernel, then at the boot prompt type “rescue64 docache”, else type “rescue docache”

Configure network on the target:

net-setup eth0

Partition the disk(s) in the target:

fdisk /dev/sda (or /dev/hda)

This is the opportunity to clean things up, divide filesystems, allocate more storage, etc.

Note: I generally convert things to use LVM, which will enable dynamic partition resizing down the road. In that case, make partition 1 on the first disk of type “Linux” (for /boot), 500 MB, make it active; create a partition to use the remainder of the disk(s) of type “Linux LVM”.

If your intent is to have the target be LVM-based:

pvcreate /dev/sda2

vgcreate VolGroup00 /dev/sda2

lvcreate --size 10G --name LV_root VolGroup00

Create/prepare other physical volumes, volume groups, and logical volumes as appropriate. (see the LVM documentation for help. This doc is not an LVM tutorial)
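The mkswap example further down refers to LV_swap, which the commands above do not create. A minimal sketch of adding it follows; the 2G size and the DRY_RUN switch are assumptions for illustration, so adjust them to your layout. By default the commands are only printed.

```shell
# Sketch: create the remaining logical volumes in the volume group.
# The 2G swap size and the DRY_RUN switch are illustrative assumptions.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

create_lvs() {
    vg="$1"
    run lvcreate --size 2G --name LV_swap "$vg"
    # Review the resulting volumes before formatting anything.
    run lvs "$vg"
}

create_lvs VolGroup00
```

Run it once as-is to review the printed commands, then again with DRY_RUN=0 on the target to execute them.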

Format the filesystems in the target (use the appropriate filesystem type, label, and path):

Regular filesystems, for example:

mkfs.ext3 -L /boot /dev/sda1

mkfs.ext3 -L / /dev/VolGroup00/LV_root

Swap filesystem(s), for example:

mkswap -L swap /dev/VolGroup00/LV_swap

**Note: If the source system is RHEL/CentOS 3.x or another equally old platform, then the livecd has probably created the filesystem with options that will render it incompatible with the OS that will eventually run on the target platform; you’ll have to use a “-I 128” in the mkfs.ext3 command, and then run this for each ext3 filesystem in order to remove the new ext3 attributes from the filesystem on the target:

tune2fs -O ^dir_index /dev/hda1

debugfs -w /dev/hda1 -R "features ^resize_inode"
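To confirm that the two features were actually removed, you can inspect the feature list that tune2fs reports. The following is a small sketch; the function name is made up, and it only parses the "Filesystem features:" line from `tune2fs -l` output passed on stdin.

```shell
# Sketch: flag ext3 features that a legacy kernel may not understand.
# Reads `tune2fs -l <device>` output on stdin. The two features checked
# are the ones removed by the commands above.
check_legacy_features() {
    features=$(sed -n 's/^Filesystem features:[[:space:]]*//p')
    status=0
    for f in dir_index resize_inode; do
        case " $features " in
            *" $f "*) echo "still set: $f"; status=1 ;;
        esac
    done
    if [ "$status" -eq 0 ]; then
        echo "ok: no unwanted features"
    fi
    return "$status"
}

# Usage on the target (device name is an example):
#   tune2fs -l /dev/hda1 | check_legacy_features
```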

Mount your target filesystems to a sandbox area, tmproot:

mkdir /mnt/tmproot

mount /dev/(root-device) /mnt/tmproot/

For example:

mount /dev/VolGroup00/LV_root /mnt/tmproot

mkdir /mnt/tmproot/boot

mount /dev/sda1 /mnt/tmproot/boot

(…mount all other filesystems relative to the tmproot sandbox.)
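With several filesystems, the order matters: a parent (such as /var) has to be mounted before anything that lives beneath it. The sketch below prints the mkdir/mount commands in a safe order from a list of "device mountpoint" pairs. The LV_var volume is a hypothetical example, and nothing is executed.

```shell
# Sketch: plan parent-first mounts under the sandbox.
# Input lines on stdin are "<device> <mountpoint-on-target>".
# Commands are printed, not executed.
plan_mounts() {
    sandbox="$1"
    # Sorting by mount point puts / before /var before /var/lib.
    LC_ALL=C sort -k2,2 | while read -r dev mp; do
        [ -n "$dev" ] || continue
        if [ "$mp" = "/" ]; then
            echo "mount $dev $sandbox"
        else
            echo "mkdir -p $sandbox$mp"
            echo "mount $dev $sandbox$mp"
        fi
    done
}

# Example layout (LV_var is hypothetical):
plan_mounts /mnt/tmproot <<'EOF'
/dev/sda1 /boot
/dev/VolGroup00/LV_var /var
/dev/VolGroup00/LV_root /
EOF
```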

Perform an initial copy of the source to the target

(on System Rescue CD): Enter the Bash shell, because zsh on sysresccd will interpret "--" incorrectly.

/bin/bash

Make a first-pass copy of the source to the destination, where SOURCE_IP is replaced by the IP address of the source system.

rsync --archive --hard-links --numeric-ids --sparse --exclude=/proc/** --exclude=/sys/** --exclude=/selinux/** --exclude=/media/** --exclude=/tmp/** --exclude=/mnt/** SOURCE_IP:/ /mnt/tmproot

** Note: in the command above, also exclude any nfs-mounted paths (such as home), as well as any other paths you don’t want to copy over, in the same way that these system paths have been excluded.

Optionally re-run the rsync command, just to get a more current, consistent snapshot. (The second execution should be much faster.) At this point, if the source platform can tolerate downtime, you may stop all databases and other data-changing services/processes prior to running the sync, and leave them stopped; this will ensure that there is no data loss during the copy.
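Before deciding whether another pass (or an outage) is needed, it can help to see how much has changed since the last sync. Here is a sketch that builds the same rsync command with rsync's --dry-run and --itemize-changes options added, so it lists changes without copying anything. The address 192.0.2.10 is a documentation placeholder for your source system, and the command is only printed.

```shell
# Sketch: build a preview command for a further rsync pass.
# SOURCE_IP is a placeholder; 192.0.2.10 is a documentation address.
SOURCE_IP="${SOURCE_IP:-192.0.2.10}"

build_resync_cmd() {
    # Same options and excludes as the first pass, plus a dry run
    # that itemizes each change.
    printf 'rsync --archive --hard-links --numeric-ids --sparse'
    printf ' --dry-run --itemize-changes'
    for p in /proc /sys /selinux /media /tmp /mnt; do
        printf " --exclude='%s/**'" "$p"
    done
    printf ' %s:/ /mnt/tmproot\n' "$1"
}

build_resync_cmd "$SOURCE_IP"
```

If the itemized list is short and limited to logs, a final pass is probably quick; a long list under the application's data directory suggests stopping the service first.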

Set up your chrooted environment and boot devices, necessary for initrd config

mount -t proc none /mnt/tmproot/proc

mount -o bind /sys /mnt/tmproot/sys

cp /etc/mtab /mnt/tmproot/etc/mtab

Verify that the device nodes for your boot device are in /mnt/tmproot/dev ; if not, copy them from /dev.
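The check-and-copy step can be sketched as a small helper. The function name and messages are illustrative; it relies on `cp -a` recreating a device node as a node (same type and major/minor numbers) rather than copying the data behind it.

```shell
# Sketch: make sure a device node exists inside the sandbox's /dev.
ensure_dev_node() {
    node="$1"      # e.g. /dev/sda1
    root="$2"      # e.g. /mnt/tmproot
    if [ -e "$root$node" ]; then
        echo "present: $node"
    else
        mkdir -p "$(dirname "$root$node")"
        # cp -a recreates the node itself instead of reading the device.
        cp -a "$node" "$root$node" && echo "copied: $node"
    fi
}

# Usage on the target, as root (device names are examples):
#   ensure_dev_node /dev/sda  /mnt/tmproot
#   ensure_dev_node /dev/sda1 /mnt/tmproot
```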

Add the LVM device nodes

chroot /mnt/tmproot /bin/bash

vgscan --mknodes

For RHEL3 / CentOS3 migration to PV (here are the notes I took while I changed it over to the Paravirt (PV) kernel; this was on XenServer 5 or such; not sure if they still apply, but including for posterity):

Before installing the xs-tools, change your /etc/fstab and /etc/mtab to point to devices that start with "xv" (in other words, translate hda1 to xvda1; the cdrom drive is xvdd). Also, issue these:

mknod /dev/xvda b 202 0

mknod /dev/xvda1 b 202 1

mknod /dev/xvda2 b 202 2

mknod /dev/xvda3 b 202 3

mknod /dev/xvda4 b 202 4

mknod /dev/xvc0 c 204 191

mknod /dev/xvdd b 202 48

Add “alias eth0 xen_net” to /etc/modules.conf

Now start install.sh on the xs-tools.iso.

It should have copied the new kernel and built a new initrd into your /boot. scp these over to the host. I placed mine in /opt/kernels/NameOfVM.

Shutdown the VM.

From the host's console, you'll need to change several parameters.

xe vm-list (gives the list of all VMs)

xe vm-list uuid=VMUUIDFROMABOVECOMMAND params=all

xe vm-param-set uuid=VMUUIDFROMABOVECOMMAND HVM-boot-policy=""

xe vm-param-set uuid=VMUUIDFROMABOVECOMMAND PV-args="root=/dev/xvda2 xencons=xvc"

xe vm-param-set uuid=VMUUIDFROMABOVECOMMAND PV-bootloader=""

xe vm-param-set uuid=VMUUIDFROMABOVECOMMAND PV-kernel="/opt/kernels/RHEL3/vmlinuz-2.4.21-47.0.1.EL.xs5.5.0.42xenU"

xe vm-param-set uuid=VMUUIDFROMABOVECOMMAND PV-ramdisk="/opt/kernels/RHEL3/initrd-2.4.21-47.0.1.EL.xs5.5.0.42xenU.img"

(Also you need to change the VBD to "bootable".)

xe vm-disk-list uuid=VMUUIDFROMABOVECOMMAND

xe vbd-param-set uuid=VBDUUIDFROMABOVECOMMAND bootable=true

(Reboot your VM from XenCenter and it should be good to go, with the exception of possible driver issues.)

exit (back out of chroot)

Final target chroot fixup prior to initrd/kernel configuration; These are the things that most often result in a non-bootable system, if a detail is missed (e.g., "switchroot failed... panic"):

mount -o bind /dev /mnt/tmproot/dev

chroot /mnt/tmproot/ /bin/bash

Fixup /etc/fstab (particularly, device names, logical volume paths, filesystem labels, etc., need to be changed to match your target system’s config).

Fixup /etc/grub.conf (or wherever your grub.conf is located) (particularly, device names, logical volume paths, filesystem labels, etc., need to be changed to match your target system’s config)

On RHEL/CentOS source/target, fixup /etc/sysconfig/network-scripts/ifcfg-eth0 for the new MAC address. (I comment out the HWADDR and MACADDR lines; otherwise, when it comes up, the target will recognize that the NIC’s MAC has changed, and reconfigure/re-ip the interface. I presume you want the target to come up at the same IP as the source, and that the source will be powered off.)
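The fstab and ifcfg fixups above are simple text edits, so they can be sketched with sed. Both helpers below are illustrative (the function names and the example device names are assumptions); they print the edited file to stdout so you can review it before replacing the original. Note that the old device name is used as a sed pattern, so names containing regex characters would need escaping.

```shell
# Sketch: rewrite the device field of matching fstab entries.
remap_fstab_device() {
    old="$1"; new="$2"; file="$3"
    # Match only the first field, followed by whitespace.
    sed "s|^${old}\([[:space:]]\)|${new}\1|" "$file"
}

# Sketch: comment out HWADDR and MACADDR lines in an ifcfg file.
comment_out_mac() {
    sed -e 's/^HWADDR=/#&/' -e 's/^MACADDR=/#&/' "$1"
}

# Usage inside the chroot (review the .new files, then move them into place):
#   remap_fstab_device /dev/hda2 /dev/VolGroup00/LV_root /etc/fstab > /etc/fstab.new
#   comment_out_mac /etc/sysconfig/network-scripts/ifcfg-eth0 > /root/ifcfg-eth0.new
```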

Configure the Target (chroot) to know to use the right scsi adapters

If this is an import to a paravirt Linux Xen domain, then add this to /etc/modprobe.conf:

alias eth0 xennet

alias scsi_hostadapter xenblk

else, add the appropriate scsi and eth0 drivers.

For vmware, by default:

alias eth0 e1000

alias scsi_hostadapter mptbase

alias scsi_hostadapter1 mptspi

or if this is an older RHEL system:

alias eth0 e1000

alias scsi_hostadapter mptbase

alias scsi_hostadapter1 mptscsih

…or whatever the appropriate driver is for your target (note: in generic HVM, such as RHEL3, ata_piix driver will be used)

Install the appropriate kernel for your target, if needed:

(If I recall correctly, this was true for CentOS 5.x, kernel 2.6.18, and earlier...) If the target is a Xen (Citrix or RHEL/CentOS Xen) paravirtualized vm but the source was not, you need to install the xen-optimized kernel:

yum install kernel-xen

If the target is a Vmware vm or physical host, but the source was a Xen-optimized kernel, you need to install the non-Xen kernel:

yum install kernel

Now, verify that /etc/grub.conf (inside the chroot) is pointing at the correct kernel.

Recreate the initrd for your target so that it will have the scsi/raid drivers, device nodes, and initial mounts it needs to boot:

For PVM Xen guest target:

mkinitrd -f -v --with=xenblk --force-scsi-probe --force-lvm-probe /boot/initrd-(kernel version) (kernel version)

For recent HVM Xen guest target:

mkinitrd -f -v --with=xenblk --force-scsi-probe --with=ata_piix --force-lvm-probe /boot/initrd-(kernel version) (kernel version)

For legacy (e.g., CentOS 3.x) HVM Xen guest target:

mkinitrd -f -v --force-scsi-probe --with=ata_piix --with=dm-mod /boot/initrd-(kernel version) (kernel version)

..or add “--with=” for your given scsi/raid module name. For VMware, these are probably “mptspi” and “mptbase”

For Ubuntu/Debian variants:

update-initramfs -c -v -k 2.6.32-24-server (specify the correct kernel version)
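For the "(kernel version)" placeholder, keep in mind that `uname -r` reports the rescue CD's running kernel, not the kernel installed on the target. One way to find the right value, sketched here with a made-up function name, is to list the module directories in the target tree.

```shell
# Sketch: list the kernel versions installed in a target tree.
# Each directory under lib/modules corresponds to one installed kernel.
list_target_kernels() {
    root="${1:-/}"
    for d in "$root"/lib/modules/*/; do
        [ -d "$d" ] && basename "$d"
    done
}

# Usage inside the chroot:
#   list_target_kernels
# Usage from the rescue system, outside the chroot:
#   list_target_kernels /mnt/tmproot
```

Use one of the printed versions in place of "(kernel version)" in the mkinitrd or update-initramfs commands above.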

exit (back out of chroot)

Install the bootloader (grub) in the target platform

CentOS: grub-install --recheck --root-directory=/mnt/tmproot --no-floppy /dev/sda (use hda instead of sda if you had to boot an old rescue cd that uses IDE drivers instead of SCSI for SATA; use another device name as appropriate, e.g., for special RAID devices)

Final fixups and reboot target:

If you have not previously shut down the source application/platform, then you may want to quiesce applications/services on the source and run a final re-sync, this time syncing only the application data (e.g., /var/lib/mysql). This will ensure no data loss with a minimal outage window (presuming the target system comes up cleanly).
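As a rough sketch of that final pass, the helper below prints the two commands involved: stopping the service on the source, then syncing only its data directory. The source address, the use of ssh as root, the mysqld service name and the data path are all example assumptions. The --delete option is an addition to the options used earlier, so that files removed on the source since the first pass are also removed on the target; leave it out if you don't want that behaviour.

```shell
# Sketch: print the commands for a final, application-data-only sync.
# Nothing is executed; review and run the commands by hand.
final_sync_plan() {
    src="$1"; service="$2"; datadir="$3"
    echo "ssh root@$src service $service stop"
    echo "rsync --archive --hard-links --numeric-ids --sparse --delete $src:$datadir/ /mnt/tmproot$datadir/"
}

# Example values only:
final_sync_plan 192.0.2.10 mysqld /var/lib/mysql
```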

If you use selinux:

touch /mnt/tmproot/.autorelabel

To prevent a fsck upon reboot:

rm /mnt/tmproot/.autofsck

Unmount all filesystems in the sandbox

umount /mnt/tmproot/*

(unmount any other filesystems you have in the tmproot)

umount /mnt/tmproot/
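If you have nested mounts (for example, something under /var in addition to /var itself), the children must be unmounted before their parents. A sketch of working out that order from /proc/mounts follows; the function name is made up, and it only prints the mount points, deepest first.

```shell
# Sketch: list sandbox mount points deepest-first, ready for umount.
# Reads /proc/mounts-style lines on stdin.
umount_order() {
    sandbox="$1"
    awk -v root="$sandbox" \
        '$2 == root || index($2, root "/") == 1 { print $2 }' |
        LC_ALL=C sort -r
}

# Usage on the rescue system:
#   umount_order /mnt/tmproot < /proc/mounts | while read -r mp; do umount "$mp"; done
```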

shutdown -h now

If the target is a Xen VM, then detach the hard disks from the HVM vm and re-attach them to the vm where they will live permanently.

If you’ve been thorough, the target should boot up.

You should be ready to kill the network on the source platform, unplug it, etc., once you’ve confirmed the target is up. Namely, once the target gets to the point of starting services, we're past the trickiest parts (initrd and switchroot), so I kill the network on the source, then bring it down gracefully. I unplug power and re-label it ("HOSTNAME-OLD" or some such) so that it doesn't accidentally get turned on.

Troubleshooting the target if it panics on boot-up

Be sure to remove the “quiet” and “graphical” options from the kernel line in the grub.conf, or you may not be able to see what’s happening.

If grub doesn’t do anything, your bootloader is probably not installed correctly or points to an invalid/incorrect kernel and/or initrd. If you need to edit the grub file from within the xen host, run xe-edit-bootloader.

In grub.conf, be sure the kernel and initrd lines are correct, relative to the root of the filesystem on which /boot resides. Thus, if /dev/sda1 is /boot and /dev/sda2 is /, then your path will be (hd0,0)/kernel-…. But if /dev/sda1 is /, then the path will be (hd0,0)/boot/kernel-….

If you get a panic after switchroot, you either have a wrong value for / in /etc/fstab or in /etc/grub.conf, or you didn’t get the right scsi/raid driver in the initrd. In either case, you need to recreate the chroot environment and re-run mkinitrd with the correct parameters. mkinitrd seems to look at /etc/mtab and/or /etc/fstab; since this occurs in the chroot, you may have to edit those files in the chroot so that mkinitrd knows to pull in certain other drivers, mount paths or filesystem labels, etc.
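Two of the checks described in this section can be sketched as small helpers: whether /boot is a separate filesystem (which decides the path prefix on the kernel/initrd lines), and whether the root device in fstab agrees with root= on the kernel line. The function names are made up for illustration. A difference in form (for example, LABEL=/ in one file and a device path in the other) is not necessarily an error, but it is worth a second look.

```shell
# Sketch: path prefix that grub.conf kernel/initrd lines need.
# Prints "/" when /boot is its own filesystem, "/boot/" otherwise.
grub_path_prefix() {
    awk '$1 !~ /^#/ && $2 == "/boot" { sep = 1 }
         END { if (sep) print "/"; else print "/boot/" }' "$1"
}

# Sketch: device (or label) that fstab mounts on /.
fstab_root() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1 }' "$1"
}

# Sketch: distinct root= values found on grub.conf kernel lines.
grub_roots() {
    sed -n 's/^[[:space:]]*kernel[[:space:]].*[[:space:]]root=\([^[:space:]]*\).*/\1/p' "$1" |
        sort -u
}

# Usage inside the chroot:
#   grub_path_prefix /etc/fstab
#   fstab_root /etc/fstab
#   grub_roots /etc/grub.conf
```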

Other Finalizing steps

Configure vm to autostart.

Load xen tools , vmware tools, or whatever is appropriate to the target.

Fix-up specific to certain OS's:

RHEL 4 HVM vm: if you get radical clock skew (ntpd won't even run as the clock slows and speeds up too much), then add "notsc" to the kernel boot line in /etc/grub.conf, and set the vm to use only 1 cpu.

Ubuntu 9.04 Jaunty on Xen: you may not have a functional console upon boot-up; you have to configure the vm to set up a login on the console. Normally, this is done in /etc/inittab, but Jaunty does not have an inittab. Instead, add to the bootloader "kernel" line the string "console=hvc0 xencons=tty" (in XenCenter, set the boot properties to that string); then create the file /etc/event.d/hvc0 with this:

start on stopped rc2
start on stopped rc3
start on stopped rc4
start on stopped rc5
stop on runlevel 0
stop on runlevel 1
stop on runlevel 6
respawn
exec /sbin/getty 38400 hvc0

Conclusion

As you can see, it's not necessarily a quick and easy recipe; but it should let you see the main steps that need to happen. But you probably have just a few standardized types of source platforms and target platforms, so it should be very reproducible once you adapt this to your platforms.

Enjoy!

DeBaan

2010/03/17
