Cpupools Howto

Cpupools is a feature introduced in Xen 4.2 that allows you to divide your physical cpus into distinct groups called "cpupools". Each pool can have its own, entirely separate scheduler. Domains are assigned to pools on creation, and can be moved from one pool to another.

This HOWTO covers the basics of working with pools. Reference information can be found in the xl man pages under the heading CPUPOOLS COMMANDS. For convenience, see the 4.5 man pages and the Xen unstable man pages.

Setting up and modifying a pool

On boot, a "default pool" named Pool-0 will be created. You can see this pool as follows:

# xl cpupool-list
Name               CPUs   Sched     Active   Domain count
Pool-0               8    credit       y          1

This shows that Pool-0 has 8 cpus, uses the credit scheduler, is active, and contains one domain -- domain 0. To see which cpus are in the pool, use the "-c" option:

# xl cpupool-list -c
Name               CPU list
Pool-0             0,1,2,3,4,5,6,7

To set up a new, empty pool, use xl cpupool-create:

# xl cpupool-create name=\"testing\"
Using config file "command line"
cpupool name:   testing
scheduler:      credit
number of cpus: 0

(Please note the \ before each quote: it stops the shell from stripping the quotes, which xl needs in order to read the value as a string.) The new pool has been set up using the default scheduler, the credit scheduler. To set it up with a different scheduler, specify it using sched=:

# xl cpupool-create name=\"testing\" sched=\"credit2\"
Using config file "command line"
cpupool name:   testing
scheduler:      credit2
number of cpus: 0
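
The same settings can also be kept in a file instead of being escaped on the command line. The following is a minimal sketch, assuming the name and sched keys described in the xlcpupool.cfg man page; the temporary file path is only an illustration, and the snippet writes the file without calling xl.

```shell
# Write a pool definition to a file. Inside the file the quotes need no
# backslashes, because the shell never sees them (quoted here-document).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
name = "testing"
sched = "credit2"
EOF

# Show what was written.
cat "$cfg"

# To create the pool from the file, run as root:
#   xl cpupool-create "$cfg"
```

As with the command-line form, a pool created this way starts with no cpus.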

Now we can remove some cpus from Pool-0, and add them to the new pool:

# xl cpupool-list -c
Name               CPU list
Pool-0             0,1,2,3,4,5,6,7
testing            
# xl cpupool-cpu-remove Pool-0 4
# xl cpupool-cpu-remove Pool-0 5
# xl cpupool-cpu-remove Pool-0 6
# xl cpupool-cpu-remove Pool-0 7
# xl cpupool-cpu-add testing 4
# xl cpupool-cpu-add testing 5
# xl cpupool-cpu-add testing 6
# xl cpupool-cpu-add testing 7
# xl cpupool-list -c
Name               CPU list
Pool-0             0,1,2,3
testing            4,5,6,7

Since cpus 4-7 are part of NUMA node 1, we could have accomplished the same thing in this way:

# xl cpupool-cpu-remove Pool-0 node:1
# xl cpupool-cpu-add testing node:1
# xl cpupool-list -c
Name               CPU list
Pool-0             0,1,2,3
testing            4,5,6,7
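
When the cpus to move do not line up with a NUMA node, the per-cpu commands can be wrapped in a small loop. The sketch below is a hypothetical helper, not part of xl: move_cpus and the XL variable are names invented for this example. Setting XL="echo xl" prints the commands instead of running them, which is what the last line does.

```shell
# Hypothetical helper: move cpus one at a time from pool $1 to pool $2.
# XL defaults to the real xl binary; XL="echo xl" gives a dry run.
move_cpus() {
    src=$1
    dst=$2
    shift 2
    for cpu in "$@"; do
        # Remove the cpu from its current pool, then add it to the new one.
        ${XL:-xl} cpupool-cpu-remove "$src" "$cpu" || return 1
        ${XL:-xl} cpupool-cpu-add "$dst" "$cpu" || return 1
    done
}

# Dry run: print the commands that would move cpus 4-7 into "testing".
XL="echo xl" move_cpus Pool-0 testing 4 5 6 7
```

Dropping the XL= prefix runs the real commands, which requires root in domain 0.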

Assigning VMs to pools and moving them

When you create a VM, it must be assigned to a pool. By default, it will be assigned to the default pool (Pool-0), but you can choose a different pool either by adding the following line to the VM's config file:

pool="testing"

Or by specifying it on the command line:

# xl create u0 pool=\"testing\"

A domain assigned to a pool can only run on cpus within that pool.

To move a VM to a different pool:

# xl cpupool-migrate u0 Pool-0

Note that Domain-0 can't be moved to another pool.
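
Moving several VMs at once is just a loop over xl cpupool-migrate. The sketch below is a hypothetical helper (migrate_domains and the XL variable are invented for this example) that skips Domain-0, since it cannot be moved. The last line is a dry run that only prints the commands.

```shell
# Hypothetical helper: move every named domain into pool $1.
# XL defaults to the real xl binary; XL="echo xl" gives a dry run.
migrate_domains() {
    pool=$1
    shift
    for dom in "$@"; do
        if [ "$dom" = "Domain-0" ]; then
            # Domain-0 has to stay where it is, so leave it out.
            echo "skipping Domain-0: it cannot be moved" >&2
            continue
        fi
        ${XL:-xl} cpupool-migrate "$dom" "$pool" || return 1
    done
}

# Dry run: u0 and u1 are example domain names.
XL="echo xl" migrate_domains testing Domain-0 u0 u1
```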

Listing and modifying the parameters of a pool scheduler

To list the parameters of a pool scheduler, first find out what kind of scheduler it is:

# xl cpupool-list 
Name               CPUs   Sched     Active   Domain count
Pool-node0           4    credit       y          2
Pool-node1           4    credit       y          0

In this case, both are using the credit scheduler, so xl sched-credit is the command we want. To read the parameters of a particular pool, we use the -p option to specify the pool, and the -s option to display only information about the scheduler (not about domains):

# xl sched-credit -p Pool-node1 -s
Cpupool Pool-node1: tslice=30ms ratelimit=1000us

To change one of the parameters:

# xl sched-credit -p Pool-node1 -s -t 5ms
# xl sched-credit -p Pool-node1 -s
Cpupool Pool-node1: tslice=5ms ratelimit=1000us

Note that because domain names are still unique across the system, there is no need to specify a cpupool when modifying a VM's scheduling parameters:

# xl sched-credit -d u0 -w 512
# xl sched-credit
Cpupool Pool-node0: tslice=30ms ratelimit=1000us
Name                                ID Weight  Cap
Domain-0                             0    256    0
Cpupool Pool-node1: tslice=5ms ratelimit=1000us
Name                                ID Weight  Cap
u0                                   2    512    0

However, do note that when moving a VM from one pool to another, you are also moving it from one scheduler to another, and its scheduling parameters are not preserved:

# xl sched-credit
Cpupool Pool-node0: tslice=30ms ratelimit=1000us
Name                                ID Weight  Cap
Domain-0                             0    256    0
Cpupool Pool-node1: tslice=5ms ratelimit=1000us
Name                                ID Weight  Cap
u0                                   2    512    0
# xl cpupool-migrate u0 Pool-node0
# xl sched-credit
Cpupool Pool-node0: tslice=30ms ratelimit=1000us
Name                                ID Weight  Cap
Domain-0                             0    256    0
u0                                   2    256    0
Cpupool Pool-node1: tslice=5ms ratelimit=1000us
Name                                ID Weight  Cap
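
Because the weight falls back to the default after a move, one option is to set it again right after migrating. The sketch below is a hypothetical helper (migrate_keep_weight and the XL variable are invented for this example) and assumes the destination pool also uses the credit scheduler. The weight is passed in explicitly rather than read back from xl. The last line is a dry run that only prints the commands.

```shell
# Hypothetical helper: move domain $1 to pool $2, then re-apply weight $3.
# XL defaults to the real xl binary; XL="echo xl" gives a dry run.
migrate_keep_weight() {
    dom=$1
    pool=$2
    weight=$3
    ${XL:-xl} cpupool-migrate "$dom" "$pool" || return 1
    # The move resets the weight, so set it again in the new pool.
    ${XL:-xl} sched-credit -d "$dom" -w "$weight"
}

# Dry run with the values from the example above.
XL="echo xl" migrate_keep_weight u0 Pool-node0 512
```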

Using cpupool-numa-split

NUMA architecture has a number of performance implications. By default, a VM's memory will be striped across all nodes on which it is runnable when it is created.

One simple way to deal with the complexity is to make a separate cpupool for each NUMA node, and when creating VMs, assign them to less busy nodes. Since the VM is restricted to run on only one node, all of its memory will come from that node, guaranteeing that it gets the highest throughput and lowest latency possible (albeit at the expense of some flexibility). To achieve this, make sure you assign the VM to the right pool at creation time (either via the config file or the command line option).

Because this is such a common thing to want to do, there is a special command to automatically set up pools based on NUMA nodes:

# xl cpupool-numa-split
# xl cpupool-list -c
Name               CPU list
Pool-node0         0,1,2,3
Pool-node1         4,5,6,7
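
Choosing a "less busy" pool can be scripted from the output of xl cpupool-list. The sketch below is a hypothetical helper (least_busy_pool is a name invented for this example) that treats the pool with the lowest domain count as the least busy one. That is a simplifying assumption, since domain count says nothing about actual load. It reads the listing on standard input, and the sample lines are copied from the listing shown earlier.

```shell
# Hypothetical helper: read "xl cpupool-list" output on stdin and print
# the name of the pool with the fewest domains (first one wins a tie).
least_busy_pool() {
    awk 'NR > 1 && NF >= 5 {
        count = $NF + 0          # last column is the domain count
        if (best == "" || count < min) { min = count; best = $1 }
    }
    END { print best }'
}

# Try it on sample output; this prints Pool-node1.
printf '%s\n' \
    'Name               CPUs   Sched     Active   Domain count' \
    'Pool-node0           4    credit       y          2' \
    'Pool-node1           4    credit       y          0' | least_busy_pool

# On a real host:
#   pool=$(xl cpupool-list | least_busy_pool)
```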