Difference between revisions of "Tuning Xen for Performance"
Revision as of 08:34, 21 February 2014
Tuning your Xen installation: recommended settings
LVM is the fastest storage backend available but it limits the options for live migration. See Storage_options for more details.
If the host has more memory than a typical laptop or desktop system, do not rely on dom0 ballooning. Instead, set the dom0 memory to a fixed value between 1 and 4GB by adding, for example, dom0_mem=1024M to the Xen command line.
The reason not to give all RAM to dom0 is that it takes time to re-purpose it when domUs are started. On a host with 128GB of RAM, things can get stuck for minutes while dom0 balloons down to make space.
If the host has more than 16 cores, limiting the number of dom0 VCPUs to 8 and pinning them can improve performance. You can do that by adding dom0_max_vcpus=8 dom0_vcpus_pin to the Xen command line. See: Can I dedicate a cpu core (or cores) only for dom0?
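As a rough sketch, the dom0 options from this section can be combined on a single Xen command line. The example below assumes a Debian-style GRUB 2 setup in which /etc/default/grub is read and the variable GRUB_CMDLINE_XEN_DEFAULT is honoured; the file location is an assumption and not something this page specifies. The two VCPU options only make sense on hosts with more than 16 cores.

```shell
# /etc/default/grub (assumed Debian-style GRUB 2 layout; adjust for your distro)
# dom0_mem fixes the dom0 memory so that dom0 does not have to balloon down.
# dom0_max_vcpus and dom0_vcpus_pin limit dom0 to 8 VCPUs and pin them.
GRUB_CMDLINE_XEN_DEFAULT="dom0_mem=1024M dom0_max_vcpus=8 dom0_vcpus_pin"
# Afterwards regenerate the GRUB configuration with your distribution's tool
# (for example update-grub on Debian) and reboot the host.
```

After the reboot, the xen_commandline field in the output of xl info should show the options that Xen actually booted with.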
Tuning your Xen installation: advanced settings
HAP vs. shadow
HAP stands for hardware assisted paging and requires a CPU feature called EPT by Intel and RVI by AMD. It is used to manage the guest's MMU. The alternative is shadow paging, which is managed entirely in software by Xen. With HAP, TLB misses are expensive, so workloads with highly random memory access patterns will suffer. With shadow paging, page table updates are expensive. HAP is enabled by default (and is the recommended setting) but can be disabled by passing hap=0 in the VM config file.
PV vs PV on HVM
Linux, NetBSD, FreeBSD and Solaris can run as PV or PV on HVM guests. Memory intensive workloads that involve the continuous creation and destruction of page tables can perform better when run in a PV on HVM guest. Examples are kernbench and sql-bench. On the other hand, memory workloads that run on a quasi-static set of page tables run better in a PV guest. An example of this kind of workload is specjbb. See Xen_Linux_PV_on_HVM_drivers#Performance_Tradeoffs for more details. A basic PV guest config file looks like the following:
bootloader = "/usr/bin/pygrub"
memory = 1024
name = "linux"
vif = [ "bridge=xenbr0" ]
disk = [ "/root/images/debian_squeeze_amd64_standard.raw,raw,xvda,w" ]
root = "/dev/xvda1"
You can also specify a kernel and ramdisk path in the dom0 filesystem directly in the VM config file, to be used for the guest:
kernel = "/boot/vmlinuz"
ramdisk = "/boot/initrd"
memory = 1024
name = "linux"
vif = [ "bridge=xenbr0" ]
disk = [ "/images/debian_squeeze_amd64_standard.raw,raw,xvda,w" ]
root = "/dev/xvda1"
See this page for instructions on how to install a Debian PV guest.
HVM guests run in a fully emulated environment that looks like a normal PC from the inside. As a consequence an HVM config file is a bit different and cannot specify a kernel and a ramdisk. On the other hand it is possible to perform an HVM installation from an emulated cdrom, using the iso of your preferred distro. It is also possible to pxeboot the VM. See the following very basic example:
builder = "hvm"
memory = 1024
name = "linuxhvm"
vif = [ "type=ioemu, bridge=xenbr0" ]
disk = [ "/images/debian_squeeze_amd64_standard.raw,raw,hda,w",
         "/images/debian-6.0.5-amd64-netinst.iso,raw,hdc:cdrom,r" ]
serial = "pty"
boot = "dc"
See this page for a more detailed example PV on HVM config file.
You can dedicate a physical cpu to a particular virtual cpu or a set of virtual cpus. If you have enough physical cpus for all your guests, including dom0, you can make sure that the scheduler won't get in your way. Even if you don't have enough physical cpus for everybody, you can still use this technique to ensure that a particular guest always has cpu time.
xl vcpu-pin Domain-name1 0 0
xl vcpu-pin Domain-name1 1 1
These two commands pin vcpu 0 and 1 of Domain-name1 to physical cpu 0 and 1. However, they do not prevent other vcpus from running on pcpu 0 and pcpu 1: you need to plan in advance and pin the vcpus of all your guests so they won't be running on pcpu 0 and 1. For example:
xl vcpu-pin Domain-name2 all 2-6
This command forces all the vcpus of Domain-name2 to run only on physical cpus 2 to 6, leaving pcpu 0 and 1 to Domain-name1. You can also add the following lines to the config file of the VM to automatically pin the vcpus to a set of pcpus at boot time:
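The setting in question is cpus, which takes a list or range of pcpus. A minimal sketch for Domain-name2, assuming the guest has 4 vcpus (the vcpus value is purely illustrative and not taken from this page):

```
# Number of vcpus given to the guest (illustrative value).
vcpus = 4
# Restrict all vcpus of this guest to pcpus 2 to 6 when it is created.
cpus = "2-6"
```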
A NUMA machine is typically a multi-socket machine built in such a way that processors have their own local memory. A group of processors connected to the same memory controller is usually called a node. Accessing memory on remote nodes is always possible, but it is usually much slower than accessing local memory. Since VMs are usually small (both in number of vcpus and amount of memory), it should be possible to avoid remote memory access altogether. Both XenD and xl (starting from Xen 4.2) try to make that happen automatically by default: if no vcpu pinning or cpupools are specified (see below), they allocate the vcpus and memory of your VMs taking the NUMA topology of the underlying host into account. Check out this article for some more details.
However, if one wants to manually control which node(s) the vcpus and the memory of a VM come from, the following mechanisms are available:
- vcpu pinning: if you use the cpus setting in the VM config file (as described in the previous chapter) to assign all the vcpus of a VM to the pcpus of a single NUMA node, all the memory of the VM will be allocated locally to that node too: no remote memory access will occur (this is available in xl starting from Xen 4.2). To figure out which physical cpus belong to which NUMA node, you can use the following command:
xl info -n
- cpupools: using the command xl cpupool-numa-split (see here) you can split your physical cpus and memory into pools according to the NUMA topology. You'll end up with one cpupool per NUMA node: use xl cpupool-list to see the available cpupools. Then you can assign each VM to a different cpupool by adding to the VM config file:
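The config option in question is pool, which names the cpupool the guest is created in. A minimal sketch, assuming xl cpupool-numa-split created a pool called Pool-node0 (the name is an assumption; check the actual names with xl cpupool-list):

```
# Create this guest inside the cpupool of the first NUMA node
# (pool name assumed; use one of the names reported by xl cpupool-list).
pool = "Pool-node0"
```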
To find out more about NUMA within this Wiki, check out the various pages from the proper category: Category:NUMA