Archived/Xen 4.3 RC6 test instructions

Specific functionality to test in RC6:
  • MMIO hole relocation workaround: We've had to change the way we handle some aspects of PCI passthrough to work around an issue with qemu-xen. We think we've got everything right, but please test your own configuration to make sure that it still works for you. We particularly need graphics cards with large amounts of video RAM tested.
  • CPU hotplug: until now, CPU hotplug support was missing in qemu-xen. It has been implemented. Thorough testing of the CPU hotplug use cases is therefore desirable. CPU unplug is not supported.

More RC6 stuff that needs testing ...


What needs to be tested

General things:

  • Making sure that Xen 4.3 compiles and installs properly on different software configurations; particularly on distros
  • Making sure that Xen 4.3, along with appropriately up-to-date kernels, works on different hardware.

Changes/fixes from RC5

  • MMIO hole relocation (for PCI Passthrough) fix for qemu-xen
  • CPU hotplug with qemu-xen

Changes/fixes from RC4

  • x86/HVM APIC_ID emulation fix
  • x86/PVHVM wallclock fix
  • XSA55 fixes

Changes/fixes from RC3 to RC4

  • x86/vtsc fix
  • x86/xsave fixes
  • VPMU workaround enabling
  • fixes related to IOMMU operations and IRQ handling in x86
  • xl cd-insert fix

Specific features:

  • Automatic NUMA placement of guests.
  • Upstream Qemu for HVM domains
  • Openvswitch integration: see Setting Up OpenvSwitch Networking for more information
  • Xen on ARM
  • Windows 2003 support
  • others?

Older Xen features where we are not sure how much test coverage they got (and which are thus marked experimental):

For more ideas about what to test, please see Testing Xen.

Installing

Getting RC6

  • xen: with a recent enough git (>= 1.7.8.2) just pull from the proper tag (4.3.0-rc6) from the main repo directly:
git clone -b 4.3.0-rc6 git://xenbits.xen.org/xen.git

With an older git version (and/or if that does not work, e.g., complaining with a message like this: Remote branch 4.3.0-rc6 not found in upstream origin, using HEAD instead), do the following:

git clone git://xenbits.xen.org/xen.git ; cd xen ; git checkout 4.3.0-rc6
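The choice between the two clone methods above could be scripted roughly as follows. This is only a sketch: the helper names version_ge and fetch_rc6 are made up for illustration, and it assumes a sort that supports -V (GNU coreutils version sort).

```shell
# Return success if version $1 is greater than or equal to version $2.
# Assumption: "sort -V" (version sort) is available on the build host.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: pick the clone method based on the installed git.
fetch_rc6() {
    git_version=$(git --version | awk '{ print $3 }')
    if version_ge "$git_version" 1.7.8.2; then
        # Recent git: clone the tag directly.
        git clone -b 4.3.0-rc6 git://xenbits.xen.org/xen.git
    else
        # Older git: clone everything, then check out the tag.
        git clone git://xenbits.xen.org/xen.git &&
            (cd xen && git checkout 4.3.0-rc6)
    fi
}
```

Calling fetch_rc6 in an empty directory should leave you with a xen directory checked out at the RC6 tag either way.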

Building

Instructions are available for building Xen on Linux and NetBSD.

Test instructions

General

  • Remove any old versions of Xen toolstack and userspace binaries (including qemu).
  • Download and install the most recent Xen 4.3 RC, as described above. Make sure to check the README for changes in required development libraries and procedures. Some particular things to note:
    • In Xen 4.3 the default installation path has changed from /usr to /usr/local. Take extra care when removing any old versions to allow for this.

Once you have Xen 4.3 RC installed check that you can install a guest etc and use it in the ways which you normally would, i.e. that your existing guest configurations, scripts etc still work.

In particular, if you are still using the (deprecated) xm/XEND toolstack, please do try your normal use cases with the XL toolstack. The XL page has some information on the differences between XEND and XL, as do the instructions from the Xen 4.2 test day.

Specific RC6 things

MMIO hole relocation workaround: We've had to change the way we handle some aspects of PCI passthrough to work around an issue with qemu-xen. We think we've got everything right, but please test your own configuration to make sure that it still works for you. We particularly need graphics cards with large amounts of video RAM tested.
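One possible way to exercise this is to boot an HVM guest with the device passed through at creation time. The sketch below assumes that the guest config already contains a pci = [ ... ] line for the device, that the device can be made assignable (pciback available in Dom0), and that the helper name, guest name and BDF are placeholders for your own setup.

```shell
# Hypothetical helper: make the device at BDF $3 assignable, boot the guest
# described by config $1 (which must already list the device in pci=),
# and show which PCI devices guest $2 ended up with.
boot_with_passthrough() {
    pt_cfg=$1; pt_dom=$2; pt_bdf=$3
    if ! grep -q '^[[:space:]]*pci[[:space:]]*=' "$pt_cfg"; then
        echo "no pci= line in $pt_cfg" >&2
        return 1
    fi
    xl pci-assignable-add "$pt_bdf" &&
        xl create "$pt_cfg" &&
        xl pci-list "$pt_dom"
}
```

For example, boot_with_passthrough /etc/xen/hvm1.cfg hvm1 01:00.0, followed by checking from inside the guest that the card is usable and looking at xl dmesg for warnings.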

CPU hotplug: until now, CPU hotplug support was missing in qemu-xen. It has been implemented. Thorough testing of the CPU hotplug use cases is therefore desirable. CPU unplug is not supported.
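A simple hotplug test could look like the sketch below. It assumes an HVM guest using qemu-xen whose config sets maxvcpus higher than vcpus, so that there is room to grow; the helper name and the guest name are placeholders.

```shell
# Hypothetical helper: raise the number of online vCPUs of guest $1 to $2
# and print the resulting vCPU list.
hotplug_vcpus() {
    hp_dom=$1; hp_count=$2
    xl vcpu-set "$hp_dom" "$hp_count" && xl vcpu-list "$hp_dom"
}
```

For example, with vcpus=2 and maxvcpus=4 in the config, hotplug_vcpus hvm1 4 should show four vCPUs. Depending on the guest OS, the new CPUs may still need to be brought online from inside the guest.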

Specific RC5 things

x86/HVM APIC_ID emulation fix and x86/PVHVM wallclock fix: exercising the suspend/resume/migrate paths for (PVon)HVM guests would be desirable.
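As a rough sketch, a save/restore cycle followed by a wallclock comparison might be scripted like this. The helper names are invented for illustration, and the skew check assumes you can read the guest clock over ssh, as in the other examples on this page.

```shell
# Hypothetical helper: save guest $1 to file $2 and restore it again.
save_restore_cycle() {
    sr_dom=$1; sr_file=$2
    xl save "$sr_dom" "$sr_file" && xl restore "$sr_file"
}

# Print the absolute difference, in seconds, between two epoch timestamps.
clock_skew() {
    cs_diff=$(( $1 - $2 ))
    if [ "$cs_diff" -lt 0 ]; then
        cs_diff=$(( 0 - cs_diff ))
    fi
    echo "$cs_diff"
}
```

After save_restore_cycle vm1 /tmp/vm1.save, something like clock_skew "$(ssh root@192.168.0.221 date +%s)" "$(date +%s)" gives an idea of whether the guest wallclock drifted. The same check can be repeated after xl migrate vm1 localhost (local migration) or a migration to another host.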

XSA55 fixes: it was the ELF parser that was vulnerable and has been patched, so feel free to try building a domain using each and every PV kernel image you have handy.
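Trying many kernel images by hand gets tedious, so here is a minimal sketch of a loop that builds a throwaway paused PV domain for each image. The guest name elf-test, the 128 MB memory size and the helper names are arbitrary choices for illustration; a domain with no disk is enough here because only the domain builder is being exercised.

```shell
# Print a minimal PV guest config named $1 that boots kernel image $2.
pv_test_config() {
    printf 'name = "%s"\nkernel = "%s"\nmemory = 128\n' "$1" "$2"
}

# Hypothetical helper: try to build a paused domain from every kernel image
# given as an argument, and report which ones the domain builder accepted.
try_kernels() {
    for tk_kernel in "$@"; do
        tk_cfg=$(mktemp) || return 1
        pv_test_config elf-test "$tk_kernel" > "$tk_cfg"
        if xl create -p "$tk_cfg"; then
            echo "OK   $tk_kernel"
            xl destroy elf-test
        else
            echo "FAIL $tk_kernel"
        fi
        rm -f "$tk_cfg"
    done
}
```

For example: try_kernels /boot/vmlinuz-*. A FAIL is not necessarily a bug (the image may simply not be a PV kernel), but crashes or odd messages in xl dmesg are worth reporting.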

Specific RC4 things

These are fixes introduced before RC4, but feel free to test or re-test them and report what you find.

x86/vtsc fix: migrating a PVHVM guest resulted in some issues with the timers in the guest itself. That should have been fixed in -rc4. To confirm this, just play with local and remote migration of PVHVM guests and look for any timing related issues and/or warning messages.

x86/xsave fixes: some of the FPU emulation code has been changed; running anything that is math and/or graphics intensive (e.g., benchmarks), in particular in 32-bit HVM guests and in 32-bit mode in a 64-bit guest, would be ideal to exercise the new code and verify that it is working properly.

VPMU workaround enabling: ideally, anyone using an Intel family 6 processor should test VPMU, as we have now enabled the workaround that was previously triggering only for a few models, for the whole CPU family. Some examples of CPU microarchitecture names of chips belonging to that family are IvyBridge, SandyBridge, Westmere, Nehalem, and others (complete list here)

IOMMU and IRQ handling on x86: just check whether, during any test you run, you see any error or warning message related to these things. The quickest and easiest way to make sure you are stressing these code paths is to pass some PCI devices through to your guests (and actually use them from within the guests).

xl cd-insert: you may want to stress the cd-insert and cd-eject. In particular, trying to use an empty file with cd-insert should now return an error.
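The empty-file case could be checked with something like the sketch below. It assumes that "return an error" shows up as a non-zero exit status from xl, and that the guest is an HVM domain with an existing virtual CD drive; the helper name and the device name hdc are placeholders.

```shell
# Hypothetical helper: try to insert a zero-length file into virtual CD
# drive $2 of guest $1. Succeeds only if xl refuses the empty file.
cd_insert_empty_file() {
    ci_dom=$1; ci_dev=$2
    ci_iso=$(mktemp) || return 2      # mktemp creates a zero-length file
    if xl cd-insert "$ci_dom" "$ci_dev" "$ci_iso"; then
        echo "unexpected: empty file accepted"
        ci_rc=1
    else
        echo "expected: empty file rejected"
        ci_rc=0
    fi
    rm -f "$ci_iso"
    return "$ci_rc"
}
```

For example: cd_insert_empty_file hvm1 hdc. Repeated xl cd-insert and xl cd-eject calls with a real ISO image are a good complement to this.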

Specific Test Instructions

Automatic NUMA placement of guests

In Xen 4.3 you will find both automatic NUMA placement of guests at creation time (which was already in Xen 4.2), and NUMA aware scheduling (for the Credit Scheduler). The former is about allocating the memory of a VM from the smallest possible number of NUMA nodes; the latter is about scheduling the VM's vCPUs on those NUMA nodes without any need to statically pin them there.

Note: a NUMA host is required to test these features. numactl --hardware, on bare metal, and/or xl info -n, on Dom0, can tell you if you have one.


NUMA placement

  • create a config file for a VM without any cpus = "..." option in it
  • create the VM with xl -vvv create vm1.cfg and check the output for something like the below:
   libxl: debug: libxl_numa.c:475:libxl__get_numa_candidate: New best NUMA placement candidate found: nr_nodes=1, nr_cpus=8, nr_vcpus=18, free_memkb=1060
   libxl: detail: libxl_dom.c:195:numa_place_domain: NUMA placement candidate with 1 nodes, 8 cpus and 1060 KB free selected
  • check that the memory has actually been allocated on (in the case above) just one node by running xl debug-keys u | xl dmesg | tail (yes, an easier and more convenient method for figuring this out is planned!):
   # xl debug-keys u | xl dmesg | tail
   (XEN) CPU13 -> NODE1
   (XEN) CPU14 -> NODE1
   (XEN) CPU15 -> NODE1
   (XEN) Memory location of each domain:
   (XEN) Domain 0 (total: 2777077):
   (XEN)     Node 0: 1271001
   (XEN)     Node 1: 1506076
   (XEN) Domain 2 (total: 245760):
   (XEN)     Node 0: 245760
   (XEN)     Node 1: 0
  • use xl list -n and xl vcpu-list to verify that the NUMA node affinity is what is being used (instead of the vCPU affinity, as it was for Xen 4.2):
   # xl list -n
   Name                                        ID   Mem VCPUs      State   Time(s) NODE Affinity
   Domain-0                                     0 10847    16     r-----     106.3 any node
   vm1                                          2   960     2     -b----      10.0 0
   # xl vcpu-list vm1
   Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
   vm1                                  2     0    5   -b-       5.3  any cpu
   vm1                                  2     1    0   -b-       4.8  any cpu
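Reading the per-node memory figures by eye works, but a small filter makes it easier to repeat the check for many guests. The sketch below assumes the xl dmesg output has the same layout as in the example above; the helper name is made up.

```shell
# Print the NUMA nodes that hold memory for the domain with ID $1.
# Reads "xl dmesg" output (taken after "xl debug-keys u") from stdin.
nodes_with_memory() {
    awk -v dom="$1" '
        # Remember which domain the following "Node" lines belong to.
        /\(XEN\) Domain [0-9]+ \(total:/ { current = $3 }
        # Print the node number if this domain has pages on it.
        /\(XEN\) +Node [0-9]+:/ && current == dom {
            node = $3
            sub(/:$/, "", node)
            if ($4 + 0 > 0) print node
        }'
}
```

For example, xl debug-keys u; xl dmesg | nodes_with_memory 2 prints just 0 for the output shown above, meaning the whole guest sits on node 0. If the debug key was pressed several times, the node numbers will be repeated once per dump.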

NUMA aware scheduling

The idea here is trying to make sure that the vCPUs of a VM run as much as possible on the pCPUs of the NODE(s) the VM has node affinity with. This is something non trivial to analyze and verify. Some hints and examples on how that can be done are available in this blog post. An easy, but unreliable way to at least have an idea if something is going wrong follows:

  • create a VM, as above, without specifying any pinning in the config file:
   # xl create /etc/xen/vm1.cfg
   Parsing config from /etc/xen/vm1.cfg
   Daemon running with PID 4632
   # ping 192.168.0.221
   PING 192.168.0.221 (192.168.0.221) 56(84) bytes of data.
   64 bytes from 192.168.0.221: icmp_req=1 ttl=64 time=0.620 ms
   ...
  • make sure the vCPUs are busy:
   $ ssh root@192.168.0.221 "yes &> /dev/null &"
   $ ssh root@192.168.0.221 "yes &> /dev/null &"
   $ ssh root@192.168.0.221 "yes &> /dev/null &"
   $ ssh root@192.168.0.221 "yes &> /dev/null &"
   $ ssh root@192.168.0.221 "ps aux|grep yes"
   root      1749 51.0  0.0   5596   608 ?        R    11:11   1:43 yes
   root      1753 49.7  0.0   5596   608 ?        R    11:11   1:38 yes
   root      1757 49.4  0.0   5596   608 ?        R    11:11   1:37 yes
   root      1761 49.4  0.0   5596   608 ?        R    11:11   1:36 yes
  • use xl vcpu-list vm1 many times, and ideally under different load conditions (e.g., other busy VMs), and verify that the vCPUs are actually scheduled on the pCPUs (CPU column) of the proper NUMA node:
   # xl vcpu-list vm1
   Name                                ID  VCPU   CPU State   Time(s) CPU Affinity
   vm1                                  3     0    1   r--     444.9  any cpu
   vm1                                  3     1    4   r--     419.0  any cpu
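Running xl vcpu-list many times by hand can be replaced by a small sampling loop. The filter below is only a sketch: it assumes that the pCPUs of the node you care about form one contiguous range (check xl info -n for the real topology), and the helper name is invented.

```shell
# Print the vCPU numbers that are currently running on a pCPU outside
# the range $1..$2. Reads "xl vcpu-list <domain>" output from stdin.
vcpus_outside_range() {
    awk -v lo="$1" -v hi="$2" '
        # Skip the header line and vCPUs with no pCPU assigned ("-").
        NR > 1 && $4 ~ /^[0-9]+$/ && ($4 + 0 < lo + 0 || $4 + 0 > hi + 0) {
            print $3
        }'
}
```

For example, if node 0 holds pCPUs 0 to 7, then for i in $(seq 1 100); do xl vcpu-list vm1 | vcpus_outside_range 0 7; sleep 1; done | sort | uniq -c counts how often each vCPU was seen off-node. Occasional hits are expected under load, since node affinity is a preference and not a hard constraint.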

Stress testing

NUMA node affinity (on a per-domain basis, for now) and vCPU pinning (on a per-vCPU basis) can coexist. The semantics are that pinning should always prevail over node affinity. A sensible way to verify this is to create a VM with a node affinity (as explained above) and change the pinning of its vCPUs on-line, checking, again, with xl list -n and xl vcpu-list that the result is what one would have expected.
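A sketch of such a check follows. The helper name is made up, and the pCPU range you pin to is up to you; pinning to pCPUs of a node other than the one the VM has affinity with is the interesting case.

```shell
# Hypothetical helper: pin vCPU $2 of guest $1 to the pCPUs in $3, then
# show the node affinity and the per-vCPU placement for comparison.
repin_and_show() {
    rp_dom=$1; rp_vcpu=$2; rp_cpus=$3
    xl vcpu-pin "$rp_dom" "$rp_vcpu" "$rp_cpus" &&
        xl list -n "$rp_dom" &&
        xl vcpu-list "$rp_dom"
}
```

For example, repin_and_show vm1 0 8-15: afterwards the CPU Affinity column for vCPU 0 should show 8-15, and the vCPU should only ever be seen running there.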

Upstream Qemu for HVM domains

In Xen 4.3 we have switched to using upstream qemu (which xl calls "qemu-xen") to provide the device model when running HVM guests instead of the older Xen fork of Qemu (which xl calls "qemu-xen-traditional"). Interesting things to test in this context:

  • Does the new device model support the guest OSes which you use, can you install as you would have with the old device model?
  • Do features behave as expected, e.g. migration, VNC console.
  • Do previously installed HVM guests, installed with qemu-xen-traditional, work when switched to qemu-xen?
    • It is expected that some guest types will not like the change in hardware which this entails. In this case, is setting device_model_version="qemu-xen-traditional" in the guest configuration sufficient to make the guest OS happy again?
    • If the guest doesn't seem to mind this change then this is useful information, please report it to us.
    • Note: The new device model does not yet support stubdomains and so the default is unchanged if you request stubdomains.
  • Does the old device model still work if you set device_model_version="qemu-xen-traditional" in the guest configuration?
  • Do new features enabled by the new device model, such as SPICE graphics, work?
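To compare the two device models without editing the original guest config, a variant config can be generated on the fly. This is a minimal sketch; the helper name and file paths are placeholders.

```shell
# Print the config file $1 with any existing device_model_version line
# dropped and a new one selecting device model $2 appended.
with_device_model() {
    grep -v '^[[:space:]]*device_model_version' "$1"
    printf 'device_model_version="%s"\n' "$2"
}
```

For example, with_device_model /etc/xen/hvm1.cfg qemu-xen-traditional > /tmp/hvm1-trad.cfg followed by xl create /tmp/hvm1-trad.cfg boots the same guest on the old device model; using qemu-xen instead gives the new one.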

Openvswitch integration

Xen 4.3 adds support for Open vSwitch based networking in addition to the existing bridge and routed networking schemes.

In order to test this you will need to setup a host with openvswitch support. Information on this is available at http://openvswitch.org/support/. In summary you need to:

  • Install a domain 0 kernel with CONFIG_OPENVSWITCH enabled (any recent PVOPS kernel should have this option).
  • Install the Open vSwitch userspace, see http://openvswitch.org/download/.
  • Configure the host networking to use Open vSwitch instead of bridge.

e.g. to create a switch (which we will call xenbr0 to simplify the transition) and add eth0 as a physical port:

# ovs-vsctl add-br xenbr0
# ovs-vsctl add-port xenbr0 eth0

These settings appear to persist across reboots, so they only need to be done once. Now you should arrange to add an IP address to xenbr0, e.g. under Debian create an entry in /etc/network/interfaces:

auto xenbr0
iface xenbr0 inet dhcp

(remember to remove/comment any bridge related items like bridge_ports eth0)

Once the host is configured you need to configure the system to use vswitch for guests. You can do this by editing /etc/xen/xl.conf and setting:

vifscript=vif-openvswitch

Now you can try starting your guests and performing the usual operations on them (e.g. reboot, migrate etc) and verify that the network is accessible to the guest.
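To confirm that a guest's interfaces really ended up on the switch, the port list can be filtered by domain ID. The sketch below assumes the usual vif<domid>.<devid> naming of backend interfaces; the helper name is made up.

```shell
# Print the switch ports that belong to the guest with domain ID $1.
# Reads "ovs-vsctl list-ports <switch>" output from stdin.
guest_ports() {
    grep "^vif$1\."
}
```

For example, ovs-vsctl list-ports xenbr0 | guest_ports "$(xl domid vm1)" should print at least one port while vm1 is running, and nothing once the guest has been shut down or migrated away.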

Reporting Bugs (& Issues)

  • Report any bugs / missing functionality / unexpected results.
  • Please put [TestDay] into the subject line
  • Also make sure you specify the RC number you are using
  • Make sure to follow the guidelines on Reporting Bugs against Xen.

Reporting success

We would love it if you could report successes by e-mailing xen-devel@lists.xen.org, preferably including:

  • Hardware: Please at least include the processor manufacturer (Intel/AMD). Other helpful information might include specific processor models, amount of memory, number of cores, and so on
  • Software: If you're using a distro, the distro name and version would be the most helpful. Other helpful information might include the kernel that you're running, or other virtualization-related software you're using (e.g., libvirt, xen-tools, drbd, &c).
  • Guest operating systems: If running a Linux version, please specify whether you ran it in PV or HVM mode.
  • Functionality tested: High-level would include toolstacks, and major functionality (e.g., suspend/resume, migration, pass-through, stubdomains, &c)

The following template might be helpful. If you use Xen 4.3.0-RC6 for testing, please make sure you state that in your report!

Subject: [TESTDAY] Test report
 
* Hardware:
 
* Software:

* Guest operating systems:

* Functionality tested:

* Comments:

For example:

Subject: [TESTDAY] Test report
 
* Hardware: 
Dell 390's (Intel, dual-core) x15
HP (AMD, quad-core) x5
 
* Software: 
Ubuntu 10.10,11.10
Fedora 17

* Guest operating systems:
Windows 8
Ubuntu 12.10,11.10 (HVM)
Fedora 17 (PV)

* Functionality tested:
xl
suspend/resume
pygrub

* Comments:
Windows 8 booting seemed a little slower than normal.

Other than that, great work!
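To make sure the report states the exact version under test, the version string can be assembled from xl info. This sketch assumes the xen_major, xen_minor and xen_extra fields are present in the output; the helper name is invented.

```shell
# Print the Xen version (e.g. 4.3.0-rc6) assembled from the xen_major,
# xen_minor and xen_extra fields. Reads "xl info" output from stdin.
xen_version_from_info() {
    awk -F' *: *' '
        $1 == "xen_major" { major = $2 }
        $1 == "xen_minor" { minor = $2 }
        $1 == "xen_extra" { extra = $2 }
        END { print major "." minor extra }'
}
```

For example, xl info | xen_version_from_info gives a string that can be pasted into the Software section, next to the output of uname -r for the Dom0 kernel.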