Tuning Xen for Performance
Revision as of 15:00, 30 July 2012
Tuning your Xen installation: recommended settings
LVM is the fastest storage backend available but it limits the options for live migration. See Storage_options for more details.
If the host has more memory than a typical laptop/desktop system, do not rely on dom0 ballooning. Instead, set the dom0 memory to a fixed value between 1 and 4GB by adding, for example, dom0_mem=1024M to the Xen command line.
1GB is enough for a fairly large host; more will be needed if you expect your users to use advanced storage types such as ZFS or distributed filesystems.
The reason not to give all RAM to dom0 is that it takes time to re-purpose it when domUs are started. On a host with 128GB of RAM, things can get stuck for minutes while dom0 balloons down to make space.
If the host has more than 16 cores, limiting the number of dom0 VCPUs to 8 and pinning them can improve performance. You can do that by adding dom0_max_vcpus=8 dom0_vcpus_pin to the Xen command line. See: Can I dedicate a cpu core (or cores) only for dom0?
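On hosts that boot Xen through GRUB 2, the Xen command line is commonly set in /etc/default/grub. The following is a minimal sketch, assuming a GRUB 2 setup whose Xen menu script reads the GRUB_CMDLINE_XEN_DEFAULT variable. The file location and the variable are assumptions about your distribution, not something this page prescribes.

```shell
# /etc/default/grub (fragment; assumes GRUB 2 with Xen support)
# Fix dom0 memory at 1GB, cap dom0 at 8 VCPUs and pin them to physical cpus.
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=1024M dom0_max_vcpus=8 dom0_vcpus_pin"
```

After editing the file, regenerate the GRUB configuration with your distribution's tool (for example update-grub on Debian) and reboot. On hosts with 16 cores or fewer, the two VCPU options can be left out.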
Tuning your Xen installation: advanced settings
HAP vs. shadow
HAP stands for hardware assisted paging and requires a CPU feature called EPT by Intel and RVI by AMD. It is used to manage the guest's MMU. The alternative is shadow paging, which is managed entirely in software by Xen. With HAP, TLB misses are expensive, so workloads with highly random memory access patterns pay a higher cost. With shadow paging, page table updates are expensive. HAP is enabled by default (and is the recommended setting) but can be disabled by setting hap=0 in the VM config file.
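As an illustration, an HVM guest config that switches to shadow paging might look like the sketch below. The memory and name values are placeholders borrowed from the examples on this page; only the hap line matters here.

```
# HVM guest using shadow paging instead of HAP (illustrative values)
builder = "hvm"
memory = 1024
name = "linuxhvm"
hap = 0
```

Removing the hap line, or setting it to 1, returns the guest to the default HAP behaviour.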
PV vs PV on HVM
Linux, NetBSD, FreeBSD and Solaris can run as PV or PV on HVM guests. Memory intensive workloads that involve the continuous creation and destruction of page tables can perform better when run in a PV on HVM guest. Examples are kernbench and sql-bench. On the other hand, memory workloads that run on a quasi-static set of page tables run better in a PV guest. An example of this kind of workload is specjbb. See Xen_Linux_PV_on_HVM_drivers#Performance_Tradeoffs for more details. A basic PV guest config file looks like the following:
    bootloader = "/usr/bin/pygrub"
    memory = 1024
    name = "linux"
    vif = [ "bridge=xenbr0" ]
    disk = [ "/root/images/debian_squeeze_amd64_standard.raw,raw,xvda,w" ]
    root = "/dev/xvda1"
You can also specify a kernel and ramdisk path in the dom0 filesystem directly in the VM config file, to be used for the guest:
    kernel = "/boot/vmlinuz"
    ramdisk = "/boot/initrd"
    memory = 1024
    name = "linux"
    vif = [ "bridge=xenbr0" ]
    disk = [ "/images/debian_squeeze_amd64_standard.raw,raw,xvda,w" ]
    root = "/dev/xvda1"
See this page for instructions on how to install a Debian PV guest.
HVM guests run in a fully emulated environment that looks like a normal PC from the inside. As a consequence, an HVM config file is a bit different and cannot specify a kernel and a ramdisk. On the other hand, it is possible to perform an HVM installation from an emulated CD-ROM, using the ISO of your preferred distro. It is also possible to PXE boot the VM. See the following very basic example:
    builder = "hvm"
    memory = 1024
    name = "linuxhvm"
    vif = [ "type=ioemu, bridge=xenbr0" ]
    disk = [ "/images/debian_squeeze_amd64_standard.raw,raw,hda,w", "/images/debian-6.0.5-amd64-netinst.iso,raw,hdc:cdrom,r" ]
    serial = "pty"
    boot = "dc"
See this page for a more detailed example PV on HVM config file.
You can dedicate a physical cpu to a particular virtual cpu or to a set of virtual cpus. If you have enough physical cpus for all your guests, including dom0, you can make sure that the scheduler won't get in your way. Even if you don't have enough physical cpus for everybody, you can still use this technique to ensure that a particular guest always has cpu time.
    xl vcpu-pin Domain-name1 0 0
    xl vcpu-pin Domain-name1 1 1
These two commands pin vcpus 0 and 1 of Domain-name1 to physical cpus 0 and 1 respectively. However, they do not prevent other vcpus from running on pcpu 0 and pcpu 1: you need to plan in advance and pin the vcpus of all your other guests so that they won't run on pcpu 0 and 1. For example:
    xl vcpu-pin Domain-name2 all 2-6
This command forces all the vcpus of Domain-name2 to run only on physical cpus 2 to 6, leaving pcpu 0 and 1 to Domain-name1. You can also add the following lines to the config file of the VM to automatically pin the vcpus to a set of pcpus at boot time:
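A minimal sketch of such config lines, assuming the cpus setting of the xl config format and a guest with four vcpus (the vcpu count is illustrative, not taken from this page):

```
# Illustrative: a guest with 4 vcpus, all restricted to physical cpus 2-6
vcpus = 4
cpus = "2-6"
```

This mirrors the xl vcpu-pin Domain-name2 all 2-6 command, but is applied every time the guest is created rather than once at runtime.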
A NUMA machine is typically a multi-socket machine built in such a way that each processor has its own local memory. Accessing the memory of other nodes is possible but slow. Read this article to know more about NUMA. Usually VMs are smaller than a single NUMA node, so it should be possible to avoid remote memory access altogether, using one of the following techniques:
- the default automatic placement: xl will try to allocate the vcpus and memory of your VMs according to the NUMA topology of your machine by default.
- vcpu pinning: if you use the cpus setting in the VM config file (as described in the previous chapter) to assign all the vcpus of a VM to the pcpus of a single NUMA node, all the memory of the VM will be allocated locally to that node too: no remote memory access will occur. You can use the command xl info -n to figure out which physical cpus belong to which NUMA node.
- cpupools: using the command xl cpupool-numa-split you can split your physical cpus and memory into pools according to the NUMA topology. You'll end up with one cpupool per NUMA node: use xl cpupool-list to see the available cpupools. Then you can assign each VM to a different cpupool adding to the VM config file:
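For the cpupool approach, the config line could look like the sketch below. The pool name Pool-node0 is an assumption about how xl cpupool-numa-split names the pools; use whichever name xl cpupool-list actually reports on your host.

```
# Assumes "xl cpupool-numa-split" has already been run on the host.
# Pool-node0 is an assumed name: check "xl cpupool-list" for the real ones.
pool = "Pool-node0"
```

A guest created with this setting is placed in that cpupool, so its vcpus run only on the physical cpus belonging to the corresponding NUMA node.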