Event Channel Internals

From Xen
Revision as of 11:18, 25 February 2013 by WeiLiu (talk | contribs) (add link to blog post on Xen.org)

General idea

Event channels are the basic primitive provided by Xen for event notifications. An event is the Xen equivalent of a hardware interrupt. Each event essentially stores one bit of information; the event of interest is signalled by transitioning this bit from 0 to 1.

Notifications are received by a guest via an upcall from Xen, indicating that an event has arrived (the bit has been set). Further notifications are suppressed until the bit is cleared again; guests must therefore check the value of the bit after re-enabling event delivery to ensure that no notifications are missed.

Event notifications can be masked by setting a flag; this is equivalent to disabling interrupts and can be used to ensure atomicity of certain operations in the guest kernel.
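
The pending-bit and mask behaviour described above can be sketched as a toy model. This is only an illustration, not Xen or Linux code: the class name `EventChannel`, its methods, and the `upcalls` counter are hypothetical names introduced here, and the upcall is simulated by incrementing a counter.

```python
class EventChannel:
    """Toy model of one event channel: a pending bit plus a mask flag.

    All names here are hypothetical; this is not the Xen ABI.
    """

    def __init__(self):
        self.pending = False
        self.masked = False
        self.upcalls = 0  # number of simulated upcalls delivered to the guest

    def signal(self):
        """Hypervisor side: raise the event."""
        was_pending = self.pending
        self.pending = True
        # Only a 0 -> 1 transition on an unmasked channel triggers an upcall.
        if not was_pending and not self.masked:
            self.upcalls += 1

    def mask(self):
        """Guest side: disable delivery, like disabling interrupts."""
        self.masked = True

    def unmask(self):
        """Guest side: re-enable delivery, then re-check the pending bit."""
        self.masked = False
        # An event may have arrived while masked; it must not be lost.
        if self.pending:
            self.upcalls += 1

    def handle(self):
        """Guest side: clear the bit so that further notifications can arrive."""
        self.pending = False


ch = EventChannel()
ch.signal()
ch.signal()      # bit already set: no second upcall
ch.handle()
ch.mask()
ch.signal()      # arrives while masked: bit is set, no upcall yet
ch.unmask()      # the re-check after unmasking delivers the missed event
```

In this model the second `signal()` is absorbed because the bit is already 1, and the event raised while masked is only delivered by the re-check in `unmask()`, which is why the guest has to look at the bit after re-enabling delivery.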

There are four kinds of events which can be mapped into event channels in Linux:

  • Inter-domain notifications. This includes all the virtual device events, since they're driven by back-ends in another domain (typically dom0).
  • VIRQs, typically used for timers. These are per-cpu events.
  • IPIs.
  • PIRQs - Hardware interrupts.

Events are stored in a bitmap shared between the guest and the hypervisor. Several tricks, such as an N-level search path and per-CPU masks, are used to speed up the search for pending events.
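
As a rough illustration of how a per-CPU mask narrows the search, the sketch below combines three bitmaps held as Python integers. It assumes that the per-CPU mask records which channels are bound to a given virtual CPU and that the global mask records which channels are currently masked; the function name `active_events` and its parameters are hypothetical.

```python
def active_events(pending, global_mask, cpu_mask):
    """Return the bits that are pending, not masked, and bound to this CPU.

    Each argument is a bitmap stored in an int, one bit per event channel.
    """
    return pending & ~global_mask & cpu_mask


# Channels 0-3 pending, channel 0 masked, channels 1 and 2 bound to this CPU.
result = active_events(0b1111, 0b0001, 0b0110)
```

With such a filter, a virtual CPU only has to scan the events it is responsible for, rather than every bit set in the shared bitmap.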

Xen releases prior to 4.3 support the 2-level event channel implementation; a 3-level implementation is planned for 4.3.

The rationale behind the extended event channel ABI is to scale better, as hardware is becoming more and more powerful and can support more DomUs running at the same time. Another approach to scalability is to disaggregate Dom0 functionality into separate domains. Here is a write-up on this topic: http://blog.xen.org/index.php/2013/02/21/improving-event-channel-scalability/.

2-level event channel ABI

This is the de facto implementation of the event channel mechanism. It is supported across all Xen releases for compatibility.

A 32-bit domain supports up to 1024 event channels (32 × 32); a 64-bit domain supports up to 4096 event channels (64 × 64).

This implementation uses a 2-level search path to speed up searching. The first level is a bitset in which each bit indicates a second-level word that contains pending events. The second level is the bitset of pending events themselves.
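
A minimal sketch of such a 2-level search, using Python integers as machine words, might look as follows. The class `TwoLevelEvents` and its attribute names are hypothetical and do not correspond to the actual shared-memory layout; the sketch also ignores masking and concurrency.

```python
class TwoLevelEvents:
    """Toy 2-level pending bitmap (hypothetical names, not the Xen layout)."""

    def __init__(self, word_bits=64):
        self.word_bits = word_bits        # 32 for a 32-bit guest, 64 for a 64-bit guest
        self.selector = 0                 # first level: one bit per second-level word
        self.pending = [0] * word_bits    # second level: one bit per event channel

    @property
    def capacity(self):
        # One selector word addresses word_bits words of word_bits bits each.
        return self.word_bits * self.word_bits

    def set_pending(self, port):
        if not 0 <= port < self.capacity:
            raise ValueError("port out of range")
        word, bit = divmod(port, self.word_bits)
        self.pending[word] |= 1 << bit
        self.selector |= 1 << word

    def drain(self):
        """Return all pending ports in ascending order and clear them."""
        ports = []
        selector, self.selector = self.selector, 0
        while selector:
            word = (selector & -selector).bit_length() - 1  # index of lowest set bit
            selector &= selector - 1                        # drop that bit
            bits, self.pending[word] = self.pending[word], 0
            while bits:
                bit = (bits & -bits).bit_length() - 1
                bits &= bits - 1
                ports.append(word * self.word_bits + bit)
        return ports


events = TwoLevelEvents(word_bits=64)
for port in (70, 3, 4095):
    events.set_pending(port)
found = events.drain()
```

Only second-level words whose selector bit is set are examined, so the cost of a scan depends on the number of non-empty words rather than on the total number of channels. The capacity of word_bits × word_bits matches the limits of 1024 and 4096 channels given above.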

3-level event channel ABI

This ABI is planned for introduction in Xen 4.3. However, it is intended to be used only by Dom0 or driver domains, as a normal DomU will not need so many event channels.

A 32-bit domain supports up to 32K event channels (32 × 32 × 32); a 64-bit domain supports up to 256K event channels (64 × 64 × 64).

The implementation uses a 3-level search path to speed up searching. The first level is a bitset which indicates the words in the second level that contain pending bits. The second level is a bitset which indicates the words in the third level that contain pending events. The third level is the bitset of pending events themselves.
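
The same idea generalises to any number of levels. The sketch below is a hypothetical model, not the proposed ABI: `capacity` and `MultiLevelEvents` are names introduced for illustration. It shows how the channel limit follows from the word size and the number of levels, and how a search walks from the top level down to the leaf.

```python
def capacity(word_bits, levels):
    """Number of event channels addressable with the given word size and depth."""
    return word_bits ** levels


class MultiLevelEvents:
    """Toy N-level pending bitmap (hypothetical names, not the Xen layout)."""

    def __init__(self, word_bits=64, levels=3):
        self.word_bits = word_bits
        self.levels = levels
        # level[0] is a single word; level[k] holds word_bits**k words.
        self.level = [[0] * (word_bits ** k) for k in range(levels)]

    def set_pending(self, port):
        if not 0 <= port < capacity(self.word_bits, self.levels):
            raise ValueError("port out of range")
        index = port
        # Walk from the leaf level up, marking each containing word as non-empty.
        for k in range(self.levels - 1, -1, -1):
            index, bit = divmod(index, self.word_bits)
            self.level[k][index] |= 1 << bit

    def first_pending(self):
        """Return the lowest pending port, or None if nothing is pending."""
        index = 0
        for k in range(self.levels):
            word = self.level[k][index]
            if word == 0:
                return None
            bit = (word & -word).bit_length() - 1  # index of lowest set bit
            index = index * self.word_bits + bit
        return index

    def clear(self, port):
        index = port
        for k in range(self.levels - 1, -1, -1):
            index, bit = divmod(index, self.word_bits)
            self.level[k][index] &= ~(1 << bit)
            if self.level[k][index]:
                break  # other bits remain in this word; upper levels stay set


events3 = MultiLevelEvents(word_bits=64, levels=3)
events3.set_pending(200000)
events3.set_pending(5)
lowest = events3.first_pending()
```

Each lookup inspects one word per level, so finding a pending event among 256K channels takes three word reads in this model. `capacity(32, 3)` and `capacity(64, 3)` give 32768 and 262144, the 32K and 256K limits quoted above.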

A naive stress test (64-bit Dom0 with 4G of RAM) shows that Dom0 hits an out-of-memory error after allocating ~45K event channels.

FIFO-based event channel ABI

A new FIFO-based event channel ABI has been proposed. It uses lock-less queues between Xen and guests to achieve the FIFO property; concurrent access to the event queues within Xen is protected by a spin lock. This approach offers useful features such as event priorities and a flexibly configurable number of event channels. The ABI is being discussed on Xen-devel, and a prototype may appear soon. Discussion thread: http://marc.info/?l=xen-devel&m=136093886617554&w=2.