Intel Platform QoS Technologies

Overview

Intel Platform QoS technologies (sometimes also referred to as Intel Platform Shared Resource Monitoring/Control technologies) are designed to help IT managers improve the performance and manageability of virtual machines. The following list gives a brief overview of the available functionality and use cases:

  • Cache Monitoring Technology (CMT), since Xen 4.5: CMT can be used to monitor Last Level Cache (LLC) usage by application threads. With this information, administrators and management applications can balance workloads more efficiently to improve both application performance and physical resource utilization. For example, CMT can be used to reduce the impact of the so-called "noisy neighbour" issue in multitenant cloud and data center environments. The noisy neighbour issue arises when two VMs (or, more generally, processes), A and B, share a cache: VM A is noisy if it runs an algorithm that dirties many cache entries, evicting entries that belong to VM B and thereby slowing VM B down. CMT lets users track how much cache each VM is using, identify the noisy ones, and take corrective action.
  • Cache Allocation Technology (CAT), since Xen 4.6: CAT allows system administrators to assign more L3 cache capacity to individual VMs, resulting in lower latency and higher performance for high-priority workloads such as NFV, real-time and video-on-demand applications.
  • Memory Bandwidth Monitoring (MBM), since Xen 4.6: MBM allows system administrators to identify memory bandwidth saturation on a Xen host that may be caused by several memory-intensive VMs running on the same host. Taking corrective action, such as migrating VMs to a different Xen host, increases scalability and performance in the data center.
  • Code and Data Prioritization (CDP), since Xen 4.7: CDP is an extension of CAT that is available on Intel Broadwell and later server platforms. CDP enables isolation and separate prioritization of code and data fetches to the L3 cache in a software-configurable manner, which can enable workload prioritization and tuning of cache capacity to the characteristics of the workload. It does this by providing separate code and data masks per Class of Service (COS). More info at xl-psr.
  • Memory Bandwidth Allocation (MBA), since Xen 4.11: MBA is a new feature available on Intel Skylake and later server platforms that allows an OS or hypervisor/VMM to slow down misbehaving applications or VMs by using a credit-based throttling mechanism. More info at xl-psr.


Man Pages and Reference

  • [1] : XL Interfaces
  • [2] : XL Man Pages
  • [3] : Intel 64 and IA-32 Architectures Software Developer's Manual (Vol. 3B, Sections 17.14 and 17.15; covers CMT, CAT, MBM, CDP)
  • [4] : Achieving QoS in Server Virtualization - Intel Platform Shared Resource Monitoring/Control in Xen

Cache Monitoring Technology

For more information, see Benefit of Cache Monitoring and an Overview.

Cache monitoring information is exposed through xl via the following commands:

$ xl psr-cmt-attach domid
$ xl psr-cmt-detach domid
$ xl psr-cmt-show cache-occupancy

A detailed description of the commands is available in the xl man pages [2].

Memory Bandwidth Monitoring

Memory Bandwidth Monitoring (MBM) is a new hardware feature available on Intel Broadwell and later server platforms. It builds on the CMT infrastructure to allow monitoring of system memory bandwidth, and introduces two new monitoring event types: total memory bandwidth and local memory bandwidth. The same Resource Monitoring ID (RMID) can be used to monitor both cache usage and memory bandwidth at the same time.

In Xen's implementation, MBM shares the underlying monitoring service with CMT and can be used to monitor memory bandwidth on a per-domain basis.

The xl interfaces are the same as those of CMT. The only difference is the monitor type, which is one of the memory monitoring types (local-mem-bandwidth or total-mem-bandwidth) instead of cache-occupancy. For example, after an xl psr-cmt-attach:

$ xl psr-cmt-show local-mem-bandwidth domid
$ xl psr-cmt-show total-mem-bandwidth domid

Cache Allocation Technology

Cache Allocation Technology (CAT) is a new feature available on Intel Broadwell and later server platforms that allows an OS or hypervisor/VMM to partition the cache (i.e. the L3 cache) based on application priority or Class of Service (COS). Each COS is configured using a capacity bitmask (CBM), which represents cache capacity and indicates the degree of overlap and isolation between classes. The cache is divided into a number of minimum-size portions, and a cache partition is a subset of those portions. Each portion corresponds to one bit in the CBM, and a set bit means that the corresponding portion is available to that COS.

For example, assuming a system with 8 portions and 3 domains:

  • A CBM of 0xff for every domain means each domain can access the whole cache. This is the default.
  • Giving one domain a CBM of 0x0f and the other two domains 0xf0 means that the first domain gets exclusive access to half of the cache (half of the portions) and the other two share the other half.
  • Giving one domain a CBM of 0x0f, one 0x30 and the last 0xc0 would give the first domain exclusive access to half the cache, and the other two exclusive access to one quarter each.
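
The three cases above can be checked with a little bit arithmetic. The following is a minimal sketch in Python, not part of Xen or xl: the function name `portion_sharing` and the domain labels `d1` to `d3` are made up for illustration. For each domain it counts how many of its cache portions no other domain can use (exclusive) and how many at least one other domain can also use (shared).

```python
def portion_sharing(cbms):
    """Return {domain: (exclusive_portions, shared_portions)}.

    cbms maps a (hypothetical) domain label to its capacity bitmask;
    bit i set means the domain may use cache portion i.
    """
    result = {}
    for name, cbm in cbms.items():
        # OR together the masks of every other domain.
        others = 0
        for other_name, other_cbm in cbms.items():
            if other_name != name:
                others |= other_cbm
        exclusive = cbm & ~others  # portions only this domain may use
        shared = cbm & others      # portions some other domain may also use
        result[name] = (bin(exclusive).count("1"), bin(shared).count("1"))
    return result


# The three examples from the list above (8 portions, 3 domains).
default = portion_sharing({"d1": 0xFF, "d2": 0xFF, "d3": 0xFF})
half_split = portion_sharing({"d1": 0x0F, "d2": 0xF0, "d3": 0xF0})
quarters = portion_sharing({"d1": 0x0F, "d2": 0x30, "d3": 0xC0})
```

In the second case, `d1` ends up with 4 exclusive portions while `d2` and `d3` have 4 shared portions each; in the third case every portion is exclusive to one domain.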

For more detailed information please refer to [3] section 17.15 - Platform Shared Resource Control: Cache Allocation Technology.

In Xen's implementation, the CBM can be configured through the libxl/xl interfaces, but COS is maintained in the hypervisor only. The cache partition granularity is per domain. Each domain has COS=0 assigned by default, and the corresponding CBM is all-ones, which means that the domain can use all of the cache by default.

System CAT information such as maximum COS and CBM length can be obtained by:

$ xl psr-hwinfo --cat

The simplest way to change a domain's CBM from its default is to run:

$ xl psr-cat-cbm-set [OPTIONS] <domid> <cbm>

where cbm is a number representing the subset of the cache that the domain may use. A cbm is valid only when:

  • Set bits only exist in the range of [0, cbmlen), where cbmlen can be obtained with xl psr-hwinfo --cat.
  • All the set bits are contiguous.
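
The two rules above are easy to check before calling xl. The following is a minimal sketch in Python; `is_valid_cbm` is a hypothetical helper, not an xl or libxl function, and rejecting an all-zero mask is an extra assumption that the rules above do not state explicitly.

```python
def is_valid_cbm(cbm, cbm_len):
    """Check a capacity bitmask against the two rules above.

    cbm_len is the CBM length reported by `xl psr-hwinfo --cat`.
    """
    # Assumption: a mask with no set bits is treated as invalid.
    if cbm <= 0:
        return False
    # Rule 1: set bits only in the range [0, cbm_len).
    if cbm >> cbm_len:
        return False
    # Rule 2: all set bits are contiguous. Dividing by the lowest set
    # bit shifts the run down to bit 0; a contiguous run then has the
    # form 2**k - 1, so adding 1 clears every bit it shares with itself.
    lowest = cbm & -cbm
    run = cbm // lowest
    return (run & (run + 1)) == 0


checks = {
    "0x0f": is_valid_cbm(0x0F, 8),   # contiguous, in range
    "0x30": is_valid_cbm(0x30, 8),   # contiguous, in range
    "0x05": is_valid_cbm(0x05, 8),   # gap between set bits
    "0x1ff": is_valid_cbm(0x1FF, 8), # bit 8 is outside [0, 8)
}
```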

In a multi-socket system, the same cbm is set on each socket by default. A per-socket cbm can be specified with the --socket SOCKET option.

Setting the CBM may fail if no COS is available. In that case, unused COS entries can be freed by setting the CBM of all related domains back to the default value (all-ones).
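
To see why resetting domains to the default frees a COS, consider the following toy model in Python. It is only an illustration under stated assumptions and does not reproduce the hypervisor's actual bookkeeping: it assumes a single socket, that COS 0 always holds the all-ones default, and that each distinct non-default CBM occupies exactly one COS. The class name `CosTable` and its methods are hypothetical.

```python
class CosTable:
    """Toy model of COS exhaustion (assumptions in the text above)."""

    def __init__(self, num_cos, cbm_len):
        self.num_cos = num_cos
        self.default_cbm = (1 << cbm_len) - 1  # all-ones mask, held by COS 0
        self.domain_cbm = {}  # domid -> non-default CBM

    @staticmethod
    def _cos_needed(domain_cbm):
        # COS 0 for the default, plus one COS per distinct non-default CBM.
        return 1 + len(set(domain_cbm.values()))

    def set_cbm(self, domid, cbm):
        """Try to set a domain's CBM; return False if no COS is left."""
        trial = dict(self.domain_cbm)
        if cbm == self.default_cbm:
            trial.pop(domid, None)  # domain falls back to COS 0
        else:
            trial[domid] = cbm
        if self._cos_needed(trial) > self.num_cos:
            return False
        self.domain_cbm = trial
        return True


table = CosTable(num_cos=2, cbm_len=8)
first = table.set_cbm(1, 0x0F)   # takes the only spare COS
second = table.set_cbm(2, 0xF0)  # fails: no COS left
reset = table.set_cbm(1, 0xFF)   # domain 1 back to default frees its COS
retry = table.set_cbm(2, 0xF0)   # now succeeds
```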

Per domain CBM settings can be shown by:

$ xl psr-cat-show

Code and Data Prioritization (CDP)

Code and Data Prioritization (CDP) is an extension of CAT that is available on Intel Broadwell and later server platforms. CDP enables isolation and separate prioritization of code and data fetches to the L3 cache in a software-configurable manner, which can enable workload prioritization and tuning of cache capacity to the characteristics of the workload. It does this by providing separate code and data masks per Class of Service (COS). More info at xl-psr.
