Remus

Remus provides transparent high availability to ordinary virtual machines running on Xen. It does this by continually live migrating a copy of a running VM to a backup server, which automatically activates if the primary server fails. Key features:

  • The backup VM is an exact copy of the primary VM (disk/memory/network). When failure occurs, the VM continues running on the backup host as if failure had never occurred.
  • The backup is completely up-to-date: even active TCP sessions are maintained without interruption.
  • Protection is transparent: existing guests can be protected without modifying them in any way.
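The page does not spell out how the backup stays consistent with what clients have already seen. The sketch below is a toy model, not Xen code: it assumes the approach described in the published Remus design, where the primary runs ahead speculatively and outbound network traffic is held back until the backup has acknowledged the corresponding checkpoint. The class and method names (`Primary`, `Backup`, `run`, `checkpoint`) are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Backup:
    """Holds the last checkpoint the backup host has fully received."""
    state: dict = field(default_factory=dict)

    def receive(self, snapshot: dict) -> bool:
        self.state = dict(snapshot)  # replace the previous checkpoint
        return True                  # acknowledgement sent to the primary


@dataclass
class Primary:
    state: dict = field(default_factory=dict)
    pending_output: list = field(default_factory=list)   # held until acked
    released_output: list = field(default_factory=list)  # visible to clients

    def run(self, key, value, packet):
        """Speculatively execute: change state and queue outbound traffic."""
        self.state[key] = value
        self.pending_output.append(packet)

    def checkpoint(self, backup: Backup):
        """Copy state to the backup; release output only after the ack."""
        if backup.receive(self.state):
            self.released_output.extend(self.pending_output)
            self.pending_output.clear()


primary, backup = Primary(), Backup()
primary.run("counter", 1, "reply-1")
primary.checkpoint(backup)             # epoch 1 is committed on the backup
primary.run("counter", 2, "reply-2")   # epoch 2 is not yet checkpointed
# If the primary failed at this point, the backup would resume with
# counter == 1, and "reply-2" was never released to any client.
```

In this model a failure between checkpoints loses only work that no client has observed, which is one way to read the claim that active TCP sessions survive failover.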

Host (dom0) requirements

  • Xen hypervisor and tools with Remus support (included with Xen 4.0+)
    • Note: Remus is not included with XCP, XenServer, or some pre-packaged Linux distributions of Xen. Check your distribution; you may need to build Xen from source.
  • Xen dom0 kernel that meets the Remus dom0 requirements
  • Shared storage is not required
  • DRBD storage is supported, allowing faster, automatic resynchronization after a failed host is brought back online
    • Without DRBD, the VM must be shut down before a failed node can be brought back online

Guest (domU) requirements

Installation

Installation varies slightly depending upon the host platform, so please see the guides below for examples.

DRBD support

Using DRBD instead of blktap2 for storage replication allows quick resynchronization of the disk backend after a failed host is back online. Because storage (re)synchronization happens online, while the VM is operational, there is no need to shut down the VM. Once storage is synchronized, Remus can be started, stopped, and restarted on a running VM at any time.

However, DRBD must be custom built with support for protocol D (see the above install guides), so the normal packaged versions of DRBD are not suitable.

Note that DRBD will be operated in dual primary mode, which carries a number of risks and management issues. Please research this topic to be aware of the potential complications.

Limitations

  • For PV domUs, Remus performs best with "suspend event channel" kernel support. Without it, Remus can still run almost any PV domU, but with degraded performance. This kernel support is not widely available (it is not currently available in Ubuntu, for example), but it is available in OpenSUSE. See Remus PV domU requirements for more information.

News

In Xen 4.0.0:

  • Xen hypervisor and tools have Remus support.
  • Only linux-2.6.18-xen is supported as the Xen dom0 kernel with Remus.
  • A PV domU must run linux-2.6.18-xen as its domU kernel.

In Xen 4.0.1:

  • Pvops dom0 kernel support for Remus was added in Xen 4.0.1-rc4 and is therefore available in the Xen 4.0.1 final release. A Linux 2.6.32-based pvops dom0 kernel can be used with Remus.
  • PV domU kernel still needs to be linux-2.6.18-xen.

In Xen 4.2:

  • Many bugfixes to Remus.
  • Remus support for pvops domU kernels: Linux 2.6.39.2 and later upstream kernel.org versions are now supported as PV domU kernels, in addition to Jeremy's xen.git xen/stable-2.6.32.x branch.
  • For better Remus performance, use a domU kernel with "suspend event channel" support: linux-2.6.18-xen or any of the xenlinux forward ports (the Novell SLES11 SP1 2.6.32 kernel, for example). Pvops domU kernels do not have suspend event channel support yet.
  • Checkpoint compression for less data to transfer between hosts.

Note that a linux-2.6.18-xen kernel must be recent enough to include the Remus support patches. Downloading the latest version from the linux-2.6.18-xen.hg Mercurial repository is recommended for use with Remus.
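As a rough aid for the Xen 4.2 pvops domU note above, the following sketch checks whether a kernel release string is at least 2.6.39.2. The helper names are hypothetical, and the check covers only upstream kernel.org version numbers: it says nothing about linux-2.6.18-xen, the xen/stable-2.6.32.x branch, or the xenlinux forward ports, which qualify by other means and would be reported as too old by this comparison.

```python
import re

# Minimum upstream pvops domU kernel named in the Xen 4.2 notes.
MIN_PVOPS_DOMU = (2, 6, 39, 2)


def parse_kernel_version(release: str) -> tuple:
    """Turn a release string such as '3.2.0-4-amd64' into a tuple of ints."""
    match = re.match(r"(\d+(?:\.\d+)*)", release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_pvops_domu_minimum(release: str) -> bool:
    """Return True if the release is 2.6.39.2 or later."""
    version = parse_kernel_version(release)
    # Pad both tuples to the same length so (3, 2) compares as (3, 2, 0, 0).
    width = max(len(version), len(MIN_PVOPS_DOMU))
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = MIN_PVOPS_DOMU + (0,) * (width - len(MIN_PVOPS_DOMU))
    return padded_version >= padded_minimum
```

Inside a guest, the release string could come from `platform.release()` in the Python standard library, for example `meets_pvops_domu_minimum(platform.release())`.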

Links
