Archived/Xen Development Projects
__NOTOC__
 
{{sidebar
 
| name = Content
 
   
| outertitlestyle = text-align: left;
 
| headingstyle = text-align: left;
 
| contentstyle = text-align: left;
 
 
| content1 = __TOC__
 
}}
 
 
This page lists various Xen-related development projects that can be picked up by anyone! If you're interested in hacking Xen, this is the place to start! Ready for the challenge?
 
== List of projects ==
 
=== Domain support ===
{{project
 
|Project=Upstreaming Xen PVSCSI drivers to mainline Linux kernel
 
|Date=01/08/2012
 
|Contact=Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 
|GSoC=No
 
|Desc=
 
The PVSCSI drivers still need to be upstreamed. Necessary operations may include:
 
* Task 1: Upstream PVSCSI scsifront frontend driver (for domU).
 
* Task 2: Upstream PVSCSI scsiback backend driver (for dom0).
 
* Send to various related upstream mailinglists for review, comments.
 
* Fix any upcoming issues.
 
* Repeat until merged to upstream Linux kernel git tree.
 
* http://git.kernel.org/?p=linux/kernel/git/konrad/xen.git;a=shortlog;h=refs/heads/devel/xen-scsi.v1.0
 
* More info: http://wiki.xen.org/xenwiki/XenPVSCSI
 
}}
 
 
{{project
 
|Project=Upstreaming Xen PVUSB drivers to mainline Linux kernel
 
|Date=01/08/2012
 
|Contact=Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 
|GSoC=No, unless Konrad believes these can be done
 
|Desc=
 
The PVUSB drivers still need to be upstreamed. Necessary operations may include:
 
* Upstream PVUSB usbfront frontend driver (for domU).
 
* Upstream PVUSB usbback backend driver (for dom0).
 
* Send to various related upstream mailinglists for review, comments.
 
* Fix any upcoming issues.
 
* Repeat until merged to upstream Linux kernel git tree.
 
* http://git.kernel.org/?p=linux/kernel/git/konrad/xen.git;a=shortlog;h=refs/heads/devel/xen-usb.v1.1
 
* More info: http://wiki.xen.org/xenwiki/XenUSBPassthrough
 
{{Comment|[[User:Lars.kurth|Lars.kurth]] 14:14, 23 January 2013 (UTC):}} Would also need more detail
 
}}
 
 
{{project
 
|Project=Implement Xen PVSCSI support in xl/libxl toolstack
 
|Date=01/12/2012
 
|Contact=Pasi Karkkainen <pasik@iki.fi>
 
|GSoC=Yes
 
|Desc=
 
xl/libxl does not currently support Xen PVSCSI functionality. Port the feature from xm/xend to xl/libxl. Necessary operations include:
 
* Task 1: Implement PVSCSI in xl/libxl, make it functionally equivalent to xm/xend.
 
* Send to xen-devel mailinglist for review, comments.
 
* Fix any upcoming issues.
 
* Repeat until merged to xen-unstable.
 
* See above for PVSCSI drivers for dom0/domU.
 
* Xen PVSCSI supports both PV domUs and HVM guests with PV drivers.
 
* More info: http://wiki.xen.org/xenwiki/XenPVSCSI
 
{{Comment|[[User:Lars.kurth|Lars.kurth]] 14:14, 23 January 2013 (UTC):}} Should be suitable, but the description needs work. Rate it in terms of challenge, size and skill. Also, the kernel functionality is not yet upstreamed; maybe use the SUSE kernel.
 
}}
 
 
{{project
 
|Project=Implement Xen PVUSB support in xl/libxl toolstack
 
|Date=01/12/2012
 
|Contact=Pasi Karkkainen <pasik@iki.fi>
 
|GSoC=Yes
 
|Desc=
 
xl/libxl does not currently support Xen PVUSB functionality. Port the feature from xm/xend to xl/libxl. Necessary operations include:
 
* Task 1: Implement PVUSB in xl/libxl, make it functionally equivalent to xm/xend.
 
* Send to xen-devel mailinglist for review, comments.
 
* Fix any upcoming issues.
 
* Repeat until merged to xen-unstable.
 
* See above for PVUSB drivers for dom0/domU.
 
* Xen PVUSB supports both PV domUs and HVM guests with PV drivers.
 
* More info: http://wiki.xen.org/xenwiki/XenUSBPassthrough
 
{{Comment|[[User:Lars.kurth|Lars.kurth]] 14:14, 23 January 2013 (UTC):}} Should be suitable, but the description needs work. Rate it in terms of challenge, size and skill. Also, the kernel functionality is not yet upstreamed; maybe use the SUSE kernel.
 
}}
 
 
{{project
 
|Project=Block backend/frontend improvements
 
|Date=01/01/2013
 
|Contact=Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 
|Desc=
 
Blkback requires a number of improvements, some of them being:
 
* Multiple disks in a guest cause contention in the global pool of pages.
 
* There is only one ring page and with SSDs nowadays we should make this larger, implementing some multi-page support.
 
* With multi-page it becomes apparent that the segment size ends up wasting a bit of space on the ring. BSD folks fixed that by negotiating a new parameter to utilize the full size of the ring. Intel had an idea for descriptor page.
 
* Add DIF/DIX support [http://oss.oracle.com/~mkp/docs/lpc08-data-integrity.pdf] for T10 PI (Protection Information), to support data integrity fields and checksums.
 
* Further perf evaluation needs to be done to see how it behaves under high load.
 
* Further discussion and issues outlined in http://lists.xen.org/archives/html/xen-devel/2012-12/msg01346.html
 
|GSoC=Yes, but we would have to chop it into nice chunks
 
}}
 
 
{{project
 
|Project=Netback overhaul
 
|Date=02/08/2012
 
|Contact=Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 
|Desc=
 
Wei Liu posted RFC patches that make the driver multi-page, multi-event-channel, and backed by a page pool. However, not all the issues have been addressed yet, so the patches still need to be finished and cleaned up. Additionally, a zero-copy implementation can be considered. Patch series and discussions:
 
* http://lists.xen.org/archives/html/xen-devel/2012-01/msg02561.html
 
* http://www.spinics.net/lists/linux-nfs/msg22575.html
 
}}
 
 
{{project
 
|Project=Multiqueue support for Xen netback/netfront in Linux kernel
 
|Date=01/22/2013
 
|Contact=Wei Liu <wei.liu2@citrix.com>
 
|Desc=
 
Please consider this project a sub-project of the Netback overhaul by Konrad Rzeszutek Wilk. Originally posted by Pasik, elaborated by Wei.
 
Multiqueue support allows a single virtual network interface (vif) to scale to multiple vcpus. Each queue has its own interrupt, and thus can be bound to a different vcpu. KVM VirtIO, VMware VMXNet3, tun/tap and various other drivers already support multiqueue in upstream Linux.
 
 
Some general info about multiqueue: http://lists.linuxfoundation.org/pipermail/virtualization/2011-August/018247.html
 
In the current implementation of Xen PV network, every vif is equipped with only one TX/RX ring pair and one event channel, which does not scale when a guest has multiple vcpus. If we want to utilize all vcpus for network processing, we need to configure multiple vifs and bind interrupts to vcpus manually. This is not ideal and involves too much configuration.
 
The multiqueue support in the Xen vif should be straightforward. It requires changing the current vif protocol and the code used to initialize / connect / reconnect vifs. However, there are risks in terms of collaboration: it is possible that multiple parties will work on the same piece of code. Here are possible obstacles and thoughts:
 
* netback worker model change - a possible change from M:N to 1:1; not really an obstacle, because 1:1 is just a special case of M:N
 
* netback page allocation mechanism change - not likely to have protocol change
 
* netback zero-copy - not likely to have protocol change
 
* receiver-side copy - touches both protocol and implementation
 
* multi-page ring - touches protocol and implementation, should be easy to merge
 
* split event channel - touches protocol and implementation, should be easy to merge
 
The basic requirements for this project are Linux kernel programming skill and knowledge of the Xen PV device model. The candidate for this project should be familiar with the open source development workflow, as it may require collaboration with several parties.
 
|Outcomes=Expected outcome:
 
* have multi-queue patch ready to upstream or upstreamed
 
* benchmark report (basic: compare single-queue / multi-queue vif. advanced: compare Xen multi-queue vif against KVM multi-queue VirtIO etc.)
 
|GSoC=Yes
 
}}
 
 
   
 
{{project

|Project=Utilize Intel QuickPath on network and block path.

|Date=01/22/2013

|Difficulty=High

|Contact=Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

|Desc=Intel QuickPath, also known as Direct Cache Access, is chipset functionality that sits in the PCIe subsystem of Intel systems. It allows the PCIe subsystem to tag which PCIe writes to memory should reside in the Last Level Cache (LLC, also known as L3, which in some cases can be 15MB, or 2.5MB per CPU). This offers an incredible speed boost - we bypass the DIMMs and the CPU can instead process the data in the cache.

Adding this component to the network or block backends could mean that we keep the data in the cache longer, and the guest can process the data right out of the cache.

|Skills=The basic requirement for this project is Linux kernel programming skill.

}}
{{project
 
|Project=perf working with Xen
 
|Date=01/01/2013
 
|Contact=Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 
|Desc=
 
The perf framework in the Linux kernel is incredibly easy to use and very good at spotting issues. It currently works great inside HVM guests, but falls short of working properly in PV guests, and of collecting the hypervisor's EIPs. This work involves closing that gap by implementing a 'perf --xen' option which would allow perf to collect counter information system-wide. Part of this work is figuring out whether we can piggyback on the oprofile hypercalls and, on the Linux kernel side, translate between the oprofile hypercalls and the MSR values (which is what perf uses).
 
   
|Outcomes=Expected outcome:
 
* Upstream patches
 
* perf --xen top working.
 
|GSoC=Yes, perhaps?
 
}}
 
 
{{project

|Project=Enabling the 9P File System transport as a paravirt device

|Date=01/20/2014

|Contact=Andres Lagar-Cavilla <andres@lagarcavilla.org>

|GSoC=Yes

|Desc=VirtIO provides a 9P FS transport, which is essentially a paravirt file system device. VMs can mount arbitrary file system hierarchies exposed by the backend. The 9P FS specification has been around for a while, while the VirtIO transport is relatively new. The project would consist of implementing a classic Xen front/back pv driver pair to provide a transport for the 9P FS protocol.

* More info: http://www.linux-kvm.org/page/9p_virtio

|Skills=Required skills include knowledge of kernel hacking and file system internals. Desired skills include: understanding of the Xen PV driver structure, and VirtIO.

|Outcomes=Expected outcome:
* LKML patches for front and back end drivers.
* In particular, the domain should be able to boot from the 9P FS.

}}
 
{{project

|Project=OVMF Compatibility Support Module support in Xen

|Date=2/5/2014

|Contact=Wei Liu <wei.liu2@citrix.com>

|GSoC=Yes

|Difficulty=Easy

|Desc=
Currently Xen supports booting HVM guests with SeaBIOS and OVMF UEFI firmware, but those are separate binaries. OVMF supports embedding a legacy BIOS blob in its binary via Compatibility Support Module support. We can try to produce a single OVMF binary with SeaBIOS in it, thus having only one firmware binary.

Tasks may include:
* figure out how CSM works
* design / implement an interface between hvmloader and the unified binary

}}
 
{{project

|Project=Improvements to firmware handling for HVM guests

|Date=07/16/2015

|Contact=Andrew Cooper <andrew.cooper3@citrix.com>

|GSoC=Yes

|Difficulty=Easy

|Skills Needed=Gnu toolchain, Familiarity with Multiboot, C

|Desc=
Currently, all firmware is compiled into HVMLoader.

This works, but is awkward when using a single distro seabios/ovmf build designed for general use. In such a case, any time an update to seabios/ovmf happens, hvmloader must be rebuilt.

The purpose of this project is to alter hvmloader to take firmware blobs as multiboot modules rather than requiring them to be built in. This reduces the burden of looking after Xen in a distro environment, and will also be useful for developers wanting to work with multiple versions of firmware.

As an extension, support loading an OVMF NVRAM blob. This enables EFI NVRAM support for guests.

}}
 
=== Hypervisor ===
{{project
 
|Project=Microcode uploader implementation
 
|Date=02/08/2012
 
|Contact=Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
 
|Desc=
 
Intel is working on an early implementation where the microcode blob would be appended to the initrd image. The kernel would scan for the appropriate magic constant and load the microcode very early.
 
 
The Xen hypervisor can do this similarly. [[GSoC_2013#microcode-uploader]]
 
|GSoC=Yes
 
}}
 
   
 
{{project
and to make it interface as well as possible with the Linux PowerClamp tools, so that the same tools could be used for both. [[GSoC_2013#powerclamp-for-xen]]
|GSoC=Yes
}}
 
 
 
{{project
 
|Project=Is Xen ready for the Real-Time/Embedded World?
 
|Date=08/08/2012
 
|Contact=Dario Faggioli <[mailto:dario.faggioli@citrix.com dario.faggioli@citrix.com]>
 
|Desc=
 
No matter if it is to build a [http://gigaom.com/2011/06/25/mobile-virtualization-finds-its-home-in-the-enterprise/ multi-personality mobile phone], or to [http://www.youtube.com/watch?v=j4uMdROzEGI help achieve consolidation in industrial and factory automation], embedded virtualization ([http://en.wikipedia.org/wiki/Embedded_hypervisor [1]], [http://www.ibm.com/developerworks/linux/library/l-embedded-virtualization/index.html [2]], [http://www.wirevolution.com/2012/02/18/mobile-virtualization/ [3]]) is upon us. In fact, quite a number of ''embedded hypervisors'' already exist, e.g.: [http://www.windriver.com/products/hypervisor/ Wind River Hypervisor], [http://dev.b-labs.com/ CodeZero] or [http://www.sysgo.com/products/pikeos-rtos-and-virtualization-concept/ PikeOS]. Xen definitely '''is''' a ''small, fast type-1 hypervisor with support for multiple VMs'' [http://en.wikipedia.org/wiki/Embedded_hypervisor [1]], so it could be a good candidate embedded hypervisor.
 
 
Moreover, Xen offers an implementation of one of the most famous and efficient real-time scheduling algorithms, [http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling Earliest Deadline First] (called SEDF in Xen), and real-time support is a key feature for a successful embedded hypervisor. Using such an advanced scheduling policy, if it is implemented correctly, is a great advancement and provides much more flexibility than only using vCPU pinning (which is what most embedded hypervisors do to guarantee real-time performance and isolation).
 
 
In the context of embedded virtualization, providing the hosted VMs with low latency (when reacting to external events) and a good level of '''temporal isolation''' among the different hosted VMs is fundamental. This means not only that failures in any VM should not affect any other one, but also that, independently of what a VM is doing, it should not be possible for it to cause any other VM to suffer from higher (and potentially dangerous!) latency and/or to impair the capability of any other VM to respect its own real-time constraints.
 
 
Whether or not Xen is able to provide its (potential) real-time VMs with all this is something which has not yet been investigated thoroughly enough. This project therefore aims at defining, implementing and executing the proper set of tests, benchmarks and measurements to understand where Xen stands, identify the sources of inefficiency, and alleviate or remove them. The very first step would be to check how much latency Xen introduces, by running some typical real-time workload within a set of VMs, under different host and guest load conditions (e.g., by using [https://rt.wiki.kernel.org/index.php/Cyclictest cyclictest] within the VMs themselves). Results can then be compared to what is achievable with other hypervisors. After this, the code paths contributing most to latency and/or disrupting temporal isolation will need to be identified, and proper solutions to mitigate their effect envisioned and implemented.
 
|GSoC=Yes
 
}}
 
 
{{project
 
|Project=Implement Temporal Isolation and Multiprocessor Support in the SEDF Scheduler
 
|Date=08/08/2012
 
|Contact=Dario Faggioli <[mailto:dario.faggioli@citrix.com dario.faggioli@citrix.com]>
 
|Desc=
 
No matter if it is to build a [http://gigaom.com/2011/06/25/mobile-virtualization-finds-its-home-in-the-enterprise/ multi-personality mobile phone], or to [http://www.youtube.com/watch?v=j4uMdROzEGI help achieve consolidation in industrial and factory automation], embedded virtualization ([http://en.wikipedia.org/wiki/Embedded_hypervisor [1]], [http://www.ibm.com/developerworks/linux/library/l-embedded-virtualization/index.html [2]], [http://www.wirevolution.com/2012/02/18/mobile-virtualization/ [3]]) is upon us. In fact, quite a number of ''embedded hypervisors'' already exist, e.g.: [http://www.windriver.com/products/hypervisor/ Wind River Hypervisor], [http://dev.b-labs.com/ CodeZero] or [http://www.sysgo.com/products/pikeos-rtos-and-virtualization-concept/ PikeOS]. Xen definitely '''is''' a ''small, fast type-1 hypervisor with support for multiple VMs'' [http://en.wikipedia.org/wiki/Embedded_hypervisor [1]], so it could be a good candidate embedded hypervisor.
 
 
Moreover, Xen offers an implementation of one of the most famous and efficient real-time scheduling algorithms, [http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling Earliest Deadline First] (called SEDF in Xen), and real-time support is a key feature for a successful embedded hypervisor. Using such an advanced scheduling policy, if it is implemented correctly, is a great advancement and provides much more flexibility than only using vCPU pinning (which is what most embedded hypervisors do to guarantee real-time performance and isolation).
 
 
However, SEDF, the EDF implementation in Xen, suffers from some rough edges. In fact, as of now, SEDF deals with events such as a vCPU blocking --in general, stopping running-- and unblocking --in general, restarting running-- by trying (and failing!) to special-case all the possible situations, resulting in code that is rather complicated, ugly, inefficient and hard to maintain. Unified approaches have been proposed for enabling blocking and unblocking in EDF, while still guaranteeing temporal isolation among different vCPUs. SEDF also lacks proper multiprocessor support, meaning that it does not properly handle SMP systems unless vCPUs are specifically and statically pinned by the user. This is a big limitation of the current implementation, especially since EDF can work well without imposing this constraint, providing much more flexibility and efficiency in exploiting the system resources to their fullest.
 
 
Therefore, this project aims at extending the SEDF scheduler, by turning it into a proper multiprocessor and temporal isolation enabled scheduling solution.
 
[[GSoC_2013#sedf-improvements]]
 
|GSoC=Yes
 
}}
 
 
{{project
 
|Project=Virtual NUMA topology exposure to VMs
 
|Date=12/12/2012
 
|Contact=Dario Faggioli <[mailto:dario.faggioli@citrix.com dario.faggioli@citrix.com]>
 
|Desc=NUMA (Non-Uniform Memory Access) systems are advanced server platforms, comprising multiple ''nodes''. Each node contains processors and memory. An advanced memory controller allows a node to use memory from all other nodes, but when that happens, data transfer is slower than accessing local memory. Memory access times are not uniform and depend on the location of the memory and the node from which it is accessed, hence the name.
 
 
Ideally, each VM should have its memory allocated out of just one node and, if its VCPUs always run there, both throughput and latency are kept at the maximum possible level. However, there are a number of reasons why this may not be possible. In such a case, i.e., if a VM ends up having its memory on, and consistently executing on, multiple nodes, we should make sure it knows it's running on a NUMA platform: a smaller one than the actual host, but still NUMA. This is very important for some specific workloads, for instance HPC ones. In fact, if the guest OS (and application) has any NUMA support, exporting a virtual topology to the guest is the only way to render that support effective, and perhaps to fill, at least to some extent, the gap introduced by the need to distribute the guest over more than one node. Just for reference, this feature, under the name of vNUMA, is one of the key and most advertised features of VMware vSphere 5 ("vNUMA: what it is and why it matters").
 
 
This project aims at introducing virtual NUMA in Xen. This has some non-trivial interaction with other aspects of the NUMA support of Xen itself, namely automatic placement at VM creation time, dynamic memory migration among nodes, and others, meaning that some design decision needs to be made. After that, virtual topology exposure will be implemented for all the kind of guests supported by Xen.
 
 
This project fits in the efforts the Xen community is making for improving the performances of Xen on NUMA systems. The full roadmap is available on this Wiki page: [[Xen NUMA Roadmap]]
 
|GSoC=Yes
 
}}
 
 
{{project
 
|Project=NUMA and ballooning on Xen
 
|Date=12/12/2012
 
|Contact=Dario Faggioli <[mailto:dario.faggioli@citrix.com dario.faggioli@citrix.com]>
 
|Desc=NUMA (Non-Uniform Memory Access) systems are advanced server platforms, comprising multiple ''nodes''. Each node contains processors and memory. An advanced memory controller allows a node to use memory from all other nodes, but when that happens, data transfer is slower than accessing local memory. Memory access times are not uniform and depend on the location of the memory and the node from which it is accessed, hence the name.
 
 
When it comes to memory, Xen offers a set of different mechanisms for over-committing the host memory; the most common, widely known and utilised of these is ballooning. This has non-trivial interference with NUMA friendliness. For instance, when freeing some memory, Xen currently tries to balloon down existing guests, but that happens without any knowledge or consideration of which node(s) the freed memory will end up on. As a result, we may be able to create the new domain, but not quite able to place all its memory on a single node (because ballooning could well have freed half of the space on one node, and half on another). It would be much better if we could at least try to make space "node-wise", i.e., trying to balloon down those guests that would allow the new one to fit into a node. Basically, we need to teach the whole ballooning mechanism about NUMA.
 
 
If possible/interesting, the other technologies for memory overcommitment could also be touched during this project. Page sharing, for instance: sharing pages between guests residing on different nodes is, in general, a bad idea, but there is nothing preventing this from happening right now; designing and implementing such NUMA awareness is what a candidate should do.
 
 
This project fits in the efforts the Xen community is making for improving the performances of Xen on NUMA systems. The full roadmap is available on this Wiki page: [[Xen NUMA Roadmap]]
 
|GSoC=Yes
 
}}
 
 
{{project
 
|Project=NUMA effects on inter-VM communication and on multi-VM workloads
 
|Date=12/12/2012
 
|Contact=Dario Faggioli <[mailto:dario.faggioli@citrix.com dario.faggioli@citrix.com]>
 
|Desc=NUMA (Non-Uniform Memory Access) systems are advanced server platforms, comprising multiple ''nodes''. Each node contains processors and memory. An advanced memory controller allows a node to use memory from all other nodes, but when that happens, data transfer is slower than accessing local memory. Memory access times are not uniform and depend on the location of the memory and the node from which it is accessed, hence the name.
 
 
If a workload is made up of more than just one VM, running on the same NUMA host, it might be best to have two (or more) VMs share a node, or exactly the opposite, depending on the specific characteristics of the workload itself, and this might be considered during placement, memory migration and perhaps scheduling.
 
 
The idea is that sometimes you have a bunch of VMs that would like to ''cooperate'', and sometimes you have a bunch of VMs that would like to be kept as far apart as possible from other VMs (''competitive''). In the ''cooperative'' VMs scenario, one wants to optimize for data flowing across VMs in the same host, e.g., because a lot of data copying is involved (a WebApp and a DB VM working together). This means trying to have VMs sharing data in the same node and, if possible, even in the same PCPU's caches, in order to maximize the memory throughput between the VMs. On the other hand, in the ''competitive'' VMs scenario, one wants to optimize for data flowing between the VMs and outside the host (e.g., when PCI-passthrough is used for NICs). In this case it would be a lot better for these VMs to use memory from different nodes and avoid wasting each other's cache lines.
 
 
This project aims at making it possible for the Xen virtualization platform (intended as hypervisor + toolstack) to take advantage of this knowledge about the characteristics of the workload and use it to maximize performance. A first step would be to enhance the automatic NUMA placement algorithm to consider the ''cooperative''-ness and/or the ''competitive''-ness of a VM during placement itself, if provided with such information by the user. A much more complicated development could be to have this relationship between the various running VMs guessed automatically on-line (e.g., by watching the memory mappings and looking for specific patterns), and to update the situation accordingly.
 
 
This project fits in the efforts the Xen community is making for improving the performances of Xen on NUMA systems. The full roadmap is available on this Wiki page: [[Xen NUMA Roadmap]]
 
 
|GSoC=Yes
}}
{{project

|Project=Integrating NUMA and Tmem

|Date=08/08/2012

|Contact=Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>, Dario Faggioli <[mailto:dario.faggioli@citrix.com dario.faggioli@citrix.com]>

|Desc=NUMA (Non-Uniform Memory Access) systems are advanced server platforms, comprising multiple ''nodes''. Each node contains processors and memory. An advanced memory controller allows a node to use memory from all other nodes, but when that happens, data transfer is slower than accessing local memory. Memory access times are not uniform and depend on the location of the memory and the node from which it is accessed, hence the name.
   
For instance, implementing something like <code>alloc_page_on_any_node_but_the_current_one()</code> (or <code>any_node_except_this_guests_node_set()</code> for multinode guests), and have Xen's Tmem implementation use it (especially in combination with selfballooning), could solve a significant part of the NUMA problem when running Tmem-enabled guests.
 
}}
 
}}
 
{{project
|Project=IOMMU control for SWIOTLB, to avoid dom0 copy of all >4K DMA allocations
|Date=01/30/2013
|Contact=Andy Cooper <andrew.cooper3@citrix.com>
|Difficulty=High
|Skills=C, Xen and Linux kernel knowledge
|Desc=On VT-d/AMD-Vi capable systems, this would cause far less overhead (one CPU copy in each direction) and allow the use of a 9K MTU at a sensible rate. This would allow for better I/O with modern hardware.
|Outcomes=TBD
|GSoC=yes}}
 
 
{{project
|Project=HVM per-event-channel interrupts
|Date=01/30/2013
|Contact=Paul Durrant <''first.last''@citrix.com>
|Difficulty=
|Skills=C, some prior knowledge of Xen useful
|Desc=Windows PV drivers currently have to multiplex all event channel processing onto a single interrupt which is registered with Xen using the HVM_PARAM_CALLBACK_IRQ parameter. This results in a lack of scalability when multiple event channels are heavily used, such as when multiple VIFs in the VM are simultaneously under load.

Goal: Modify Xen to allow each event channel to be bound to a separate interrupt (the association being controlled by the PV drivers in the guest) to allow separate event channel interrupts to be handled by separate vCPUs. There should be no modifications required to the guest OS interrupt logic to support this (as there is with the current Linux PV-on-HVM code) as this will not be possible with a Windows guest.
|Outcomes=Code is submitted to xen-devel@xen.org for inclusion in xen-unstable
|GSoC=yes}}
 
   
 
=== Userspace Tools ===
   
 
{{project
|Project=Convert PyGrub to C
|Date=15/11/2012
|Contact=Roger Pau Monné <[mailto:roger.pau@citrix.com roger.pau@citrix.com]>
|Desc=
With the replacement of xend with xl/libxl, PyGrub is the only remaining Python userspace component of the Xen tools. Since it already depends on a C library (libfsimage), converting it to C code should not be a huge effort. The PyGrub Python code is little more than a parser for grub and syslinux configuration files.

Some embedded distros (mainly Alpine Linux) have already mentioned their interest in dropping the Python package as a requirement for a Dom0; this would make a Xen Dom0 much smaller.
|GSoC=Yes
}}
 
   
 
   
{{project
|Project=VM Snapshots
|Date=16/01/2013
|Contact=<[mailto:stefano.stabellini@eu.citrix.com Stefano Stabellini]>
|Desc=Although xl is capable of saving and restoring a running VM, it is not currently possible to create a snapshot of the disk together with the rest of the VM.

QEMU is capable of creating, listing and deleting disk snapshots on QCOW2 and QED files, so even today, by issuing the right commands via the QEMU monitor, it is possible to create disk snapshots of a running Xen VM. However, xl and libxl have no knowledge of these snapshots and cannot create, list or delete them.

This project is about implementing disk snapshot support in libxl, using the QMP protocol to issue commands to QEMU. Users should be able to manage the entire life-cycle of their disk snapshots via xl. The candidate should also explore ways to integrate disk snapshots into the regular Xen save/restore mechanisms and provide a solid implementation for xl/libxl.

[[GSoC_2013#vm-snapshots]]
|GSoC=Yes
}}
 
   
 
{{project
|Project=KDD (Windows Debugger Stub) enhancements
|Date=01/30/2013
|Contact=Paul Durrant <paul.durrant@citrix.com>
|Difficulty=Medium
|Skills=C, Kernel Debuggers, Xen, Windows
|GSoC=yes}}
   
=== Performance ===

{{project
|Project=Lazy restore using memory paging
|Date=01/20/2014
|Contact=Andres Lagar-Cavilla <andres@lagarcavilla.org>
|GSoC=Yes
|Desc=VM save/restore results in a large amount of I/O and non-trivial downtime, as the entire memory footprint of a VM is read from disk.

Xen memory paging support on x86 is now mature enough to allow for lazy restore, whereby the footprint of a VM is backfilled while the VM executes. If the VM hits a page that is not yet present, it is eagerly paged in.

There has been some concern recently about the lack of documentation and mature tools that use xen-paging. This is a good way to address the problem.

|Skills=A good understanding of save/restore, and virtualized memory management (e.g. EPT, shadow page tables, etc). In principle the entire project can be implemented in user-space C code, but it may be the case that new hypercalls are needed for performance reasons.

|Outcomes=Expected outcome:
* Mainline patches for libxc and libxl
}}
 
 
{{project
|Project=CPUID Programming for Humans
|Date=02/04/2014
|Contact=Andres Lagar-Cavilla <andres@lagarcavilla.org>
|GSoC=Yes
|Desc=When creating a VM, a policy is applied to mask certain CPUID features. Right now it's black magic.

The KVM stack has done an excellent job of making this human-usable and understandable. For example, in a qemu-kvm command line you may encounter:

-cpu SandyBridge,+pdpe1gb,+osxsave,+dca,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme

And in <qemu>/target-i386.c you find a fairly comprehensive description of x86 processor models, which CPUID features are inherent, and which CPUID feature each of these symbolic flags enables.

In the Xen world, there is a libxc interface to do the same, although it's all hex and register driven. It's effective, yet horrible to use.

An ideal outcome would have libxl config files and the command line absorb a similarly human-friendly description of the CPUID features a user wishes for the VM, and interface appropriately with libxl. Further, autodetection of the best CPUID should yield human-readable output, making it easy to understand what the VM thinks about its processor.

Finally, interfacing with libvirt should be carefully considered.

CPUID management is crucial in a heterogeneous cluster, where migrations and save/restore require careful processor feature selection to avoid blow-ups.

|Skills=A good understanding of C user-land programming, and the ability to dive into qemu/libvirt (for reference code and integration), as well as libxc and libxl (for implementation).

|Outcomes=Expected outcome:
* Mainline patches for libxl
}}
   
=== Mirage and XAPI projects ===
There are separate wiki pages about XCP and XAPI related projects. Make sure you check these out as well!

{{project
|Project=Create a tiny VM for easy load testing
|Date=01/30/2013
|Contact=Dave Scott <''first.last''@citrix.com>
|Difficulty=Medium
|Skills=OCaml
|Desc=The http://www.openmirage.org/ framework can be used to create tiny 'exokernels': entire software stacks which run directly on the Xen hypervisor. These VMs have such a small memory footprint (16 MiB or less) that many of them can be run even on relatively small hosts. The goal of this project is to create a specific 'exokernel' that can be configured to generate a specific I/O pattern, and to create configurations that mimic the boot sequence of Linux and Windows guests. The resulting exokernel will then enable cheap system load testing.

The first task is to generate an I/O trace from a VM. For this we could use 'xen-disk', a userspace Mirage application which acts as a block backend for xen guests (see http://openmirage.org/wiki/xen-synthesize-virtual-disk). Following the wiki instructions we could modify a 'file' backend to log the request timestamps, offsets and buffer lengths.

The second task is to create a simple kernel based on one of the MirageOS examples (see http://github.com/mirage/mirage-skeleton). The 'basic_block' example shows how reads and writes are done. The previously-generated log could be statically compiled into the kernel and executed to generate load.
|Outcomes=1. a repository containing an 'exokernel' (see http://github.com/mirage/mirage-skeleton)

2. at least 2 I/O traces, one for Windows boot and one for Linux boot (any version)
|GSoC=yes}}
 
   
 
{{project
|Project=Fuzz testing Xen with Mirage
|Date=28/11/2012
|Contact=Anil Madhavapeddy <anil@recoil.org>
|Skills=OCaml
|Difficulty=medium
|Desc=
MirageOS (http://openmirage.org) is a type-safe exokernel written in OCaml which generates highly specialised "appliance" VMs that run directly on Xen without requiring an intervening guest kernel. We would like to use the Mirage/Xen libraries to fuzz test all levels of a typical cloud toolstack. Mirage has low-level bindings for Xen hypercalls, mid-level bindings for domain management, and high-level bindings to XCP for cluster management. This project would build a QuickCheck-style fuzzing mechanism that would perform millions of random operations against a real cluster, and identify bugs with useful backtraces.

The first task would be to become familiar with a specification-based testing tool like Kaputt (see http://kaputt.x9c.fr/). The second task would be to choose an interface for testing; perhaps one of the hypercall ones.

[[GSoC_2013#fuzz-testing-mirage]]
|Outcomes=1. a repo containing a fuzz testing tool; 2. some unexpected behaviour with a backtrace (NB it's not required that we find a critical bug, we just need to show the approach works)
|GSoC=yes
}}
   
{{project
|Date=28/11/2012
|Contact=Anil Madhavapeddy <anil@recoil.org>
|Difficulty=hard
|Skills=OCaml
|Desc=
MirageOS (http://openmirage.org) is a type-safe exokernel written in OCaml which generates highly specialised "appliance" VMs that run directly on Xen without requiring an intervening guest kernel. An interesting consequence of programming Mirage applications in a functional language is that the device drivers can be substituted with emulated equivalents. Therefore, it should be possible to test an application under extreme load conditions as a simulation, and then recompile the *same* code into production. The simulation can inject faults and test data structures under distributed conditions, but using a fraction of the resources required for a real deployment.

The first task is to familiarise yourself with a typical Mirage application; I suggest a webserver (see https://github.com/mirage/mirage-www). The second task is to replace the ethernet driver with a synthetic equivalent, so we can feed it simulated traffic. Third, we should inject simulated web traffic (recorded from a real session) and attempt to determine how the application response time varies with load (number of connections; incoming packet rate).

This project will require a solid grasp of distributed protocols, and functional programming. Okasaki's book will be a useful resource...
|Outcomes=1. a repo/branch with a fake ethernet device and a traffic simulator; 2. an interesting performance graph
|GSoC=no, too much work
}}
   
{{project
|Date=28/11/2012
|Contact=Anil Madhavapeddy <anil@recoil.org>
|Difficulty=hard
|Skills=OCaml, Haskell, Java
|Desc=
There are several languages available that compile directly to Xen microkernels, instead of running under an intervening guest OS. We're dubbing such specialised binaries "unikernels". Examples include:
* Haskell: HalVM https://github.com/GaloisInc/HaLVM#readme
* Erlang: ErlangOnXen http://erlangonxen.org
* Java: GuestVM http://labs.oracle.com/projects/guestvm/, OSv https://github.com/cloudius-systems/osv

Each of these is in a different state of reliability and usability. We would like to survey all of them, build some common representative benchmarks to evaluate them, and build a common toolchain based on XCP that will make it easier to share code across such efforts. This project will require a reasonable grasp of several programming languages and runtimes, and should be an excellent project to learn more about the innards of popular languages.

[[GSoC_2013#unikernel-substrate]]
|Outcomes=1. a repo containing a common library of low-level functions; 2. a proof of concept port of at least 2 systems to this new library
|GSoC=no, too difficult
}}
   
   
<!--
{{project
|Project=Expose counters for additional aspects of system performance in XCP
|Date=01/30/2013
|Contact=Jonathan Davies <''first.last''@citrix.com>
|Difficulty=Low
|Skills=Basic familiarity with administration of a Xen and/or Linux host
|Desc=XCP stores performance data persistently in round robin databases (RRDs). Presently, XCP only exposes a few aspects of system performance through the RRD mechanism, e.g. vCPU and pCPU utilisation, VM memory size, host network throughput.

The XCP RRD daemon (xcp-rrdd) provides a plugin interface to allow other processes to provide data sources. In principle, these plugins can be written in any language by using the XML-RPC/JSON interface (although presently bindings only exist for OCaml).

The project: create plugins that expose additional information, including things like:
* total amount of CPU cycles used by each VM
* VM- or VBD/VIF-level disk and network throughput
* number of event channels consumed per domain
* how much work is each VM demanding of qemu, netback, blkback, xenstored?
* perhaps other statistics only currently easily obtainable via xentrace
|Outcomes=A set of plugins is authored in a suitable language and demonstrated to work in XCP. The code is submitted to the XCP project on github.
|GSoC=yes}}

{{project
|Project=Add support for XCP performance counters to be sampled at varying rates
|Date=01/30/2013
|Contact=Jonathan Davies <''first.last''@citrix.com>
|Difficulty=Medium
|Skills=xcp-rrdd is coded in OCaml, so familiarity with this language would be helpful
|Desc=XCP's RRD daemon (xcp-rrdd) stores performance data persistently in 'round robin databases' (RRDs). Each of these is a fixed size structure containing data at multiple resolutions. 'Data sources' are sampled at five-second intervals and points are added to the highest resolution RRD. Periodically each high-frequency RRD is 'consolidated' (e.g. averaged) to produce a data point for a lower-frequency RRD. In this way, data for a long period of time can be stored in a space-efficient manner, with the older data being lower in resolution than more recent data.

However, some data sources change very slowly (e.g. CPU temperature, available disk capacity), so it is overkill to sample them every five seconds. This becomes a problem when it is costly to sample them, perhaps because it involves a CPU-intensive computation or disk activity.

The RRD daemon provides a plugin interface to allow other processes to provide data sources.

The project goal is to generalise the RRD daemon's data-source sampling mechanism to allow it to sample data sources at different frequencies, and to extend the plugin interface to allow plugins to suggest the frequency at which they are sampled.
|Outcomes=A mechanism is defined and code produced to meet the project goals. The code is submitted to the XCP project on github.
|GSoC=yes}}

{{project
|Project=XCP backend to Juju/Chef/Puppet/Vagrant
|Date=01/30/2013
|Contact=Jonathan Ludlam <''first.last''@citrix.com>
|Difficulty=Medium to small
|Skills=
|Desc=Juju, Chef and Puppet are all tools that are used to provision and configure virtual and, in some cases, physical machines. They all have pluggable backends and can target many cloud providers' APIs, but none of them currently target the Xen API.
|Outcomes=A new backend for one or more of these, able to install and configure virtual machines on a machine running XCP.
|GSoC=yes}}

{{project
|Project=RBD (Ceph) client support in XCP
|Date=01/30/2013
|Contact=James Bulpin <''first.last''@citrix.com>
|Difficulty=Medium
|Skills=C and Python
|Desc=The Ceph distributed storage system allows objects to be stored in a distributed fashion over a number of storage nodes (as opposed to using a centralised storage server). Ceph provides for a block device abstraction (RBD); clients currently exist for Linux (exposing the device as a kernel block device) and qemu (providing a backend for emulated/virtualised virtual disks). It is desirable to have support for RBD in XCP to allow VMs to have virtual disks in a Ceph distributed storage system and therefore to allow VMs to be migrated between hosts without the need for centralised storage. Although it is possible to use the Linux kernel RBD client to do this, the scalability is limited and there is no integrated way to manage the creation/destruction and attachment/detachment of RBDs. Ceph provides a user-space client library which could be used by XCP's tapdisk program (this handles virtual disk read and write requests); a driver (Python script) for XCP's storage manager could be written to manage the creation/destruction and attachment/detachment of RBDs.
|Outcomes=A tapdisk driver (userspace C) would be written to act as a datapath interface between XCP's block backend and a Ceph distributed store. It is likely that a reasonable amount of this work could be performed by porting similar support from qemu. A storage manager driver (Python) would be written to activate/deactivate RBDs on demand and create and destroy them in response to XCP API calls. A Ceph distributed store would be represented by a XCP "storage repository" with a XCP "VDI" being implemented by a RBD. The end result would be that XCP VMs could run using Ceph RBDs as the backing for their virtual disks. All code would be submitted to the xen.org XCP tapdisk (blktap) and SM projects.
|GSoC=yes}}

{{project
|Project=Add connection tracking capability to the Linux OVS
|Date=01/30/2013
|Contact=Mike Bursell <''first.last''@citrix.com>
|Difficulty=Medium
|Skills=C, networking concepts, OpenFlow useful (but not essential upfront)
|Desc=The open-vswitch (OVS) currently has no concept of connections, only flows. One piece of functionality which it would be interesting to create would be to allow incoming flows for existing outgoing flows in order, for instance, to allow telnet or HTTP connections. This would require an add-on which would act as a proxy between an OpenFlow controller and the OVS instance. It would intercept requests for flow rules for incoming flows, match against existing outgoing flows, and, based on simple rules, decide whether to set up a relevant OpenFlow rule in the OVS. This is comparable to iptables' "RELATED/ESTABLISHED" state matching. If time were short, a simpler, non-proxying version for use without a controller would still be useful.
|Outcomes=A daemon which would bind to the local OVS OpenFlow interface and perform the following work:
* parse OpenFlow requests
* read OVS flow table(s)
* optionally maintain a whitelist of acceptable port/IP address tuples
* write OVS flow rules
* optionally act as a proxy between an OVS instance and an OpenFlow Controller, relaying requests which it has decided not to parse out to the Controller, and the results back to the OVS instance.

Code would be submitted to the openvswitch project.
|GSoC=yes}}

* XCP and XAPI development projects: [[XAPI project suggestions]]
* XCP short-term roadmap: [[XCP short term roadmap]]
* XCP monthly developer meetings: [[XCP Monthly Meetings]]
-->
 
* XAPI developer guide: [[XAPI Developer Guide]]
 
=== Xen.org testing system ===
{{project
|Project=Testing PV and HVM installs of Debian using debian-installer
|Date=2013-01-23
|Contact=Ian Jackson <ian.jackson@eu.citrix.com>
|Desc=
The testing system "osstest", which is used for the push gate for the xen and related trees, should have Debian PV and HVM guest installations, based on the standard Debian installer, in its repertoire. Also, it currently always tests kernels as host and guest in the same installation.
* Task 1: Generalise the functions in osstest which generate debian-installer preseed files and manage the installation, to teach them how to set up PV and HVM guests, and provide an appropriate ts-* invocation script.
* Task 2: Extend the guest installer from task 1 to be able to install a kernel other than the one which comes from the Debian repository, so that it is possible to test one kernel as host with a different specified kernel as guest.
* Task 3: Determine which combinations of kernel branches should be added to the test schedules, push gates, etc. and write this up in a report for deployment by the infrastructure maintainers.
* More information: See xen-devel test reports. Code is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=summary

[[GSoC_2013#debian-installer]]
|GSoC=Yes
}}

{{project
|Project=Testing NetBSD
|Date=2013-01-23
|Contact=Ian Jackson <ian.jackson@eu.citrix.com>
|Desc=
The testing system "osstest", which is used for the push gate for the xen and related trees, should be able to test NetBSD both as host and guest.
* Task 1: Understand how best to automate installation of NetBSD. Write code in osstest which is able to automatically and noninteractively install NetBSD on a bare host.
* Task 2: Test and debug osstest's automatic building arrangements so that they can correctly build Xen on NetBSD.
* Task 3: Write code in osstest which can automatically install the Xen from task 2 on the system installed by task 1.
* Task 4: Debug at least one of the guest installation capabilities in osstest so that it works on the Xen system from task 3.
* Task 5: Rework the code from task 1 so that it can also install a NetBSD guest, ideally either as a guest of a Linux dom0 or of a NetBSD dom0.
* Task 6: Determine which versions of NetBSD and of Linux should be tested in which combinations and write this up in a report for deployment by the infrastructure maintainers.
* More information: See xen-devel test reports. Code is at http://xenbits.xen.org/gitweb/?p=osstest.git;a=summary

[[GSoC_2013#testing-netbsd]]
|GSoC=Yes
}}
 
   
 
== Quick links to changelogs of the various Xen related repositories/trees ==
Please see [[XenRepositories]] wiki page!

[[Category:Archived]]
[[Category:Xen]]
[[Category:Xen 4.4]]
[[Category:PVOPS]]
[[Category:Developers]]

Latest revision as of 19:04, 18 February 2016


Utilize Intel QuickPath on network and block path.

Date of insert: 01/22/2013; Verified: Not updated in 2020; GSoC: Yes
Technical contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: High
Skills Needed: The basic requirement for this project is Linux kernel programming skill. The candidate for this project should be familiar with open source development workflow as it may require collaboration with several parties.
Description: Intel QuickPath, also known as Direct Cache Access, is chipset logic that sits in the PCIe subsystem of Intel systems. It allows the PCIe subsystem to tag which PCIe writes to memory should reside in the Last Level Cache (LLC, also known as L3, which in some cases can be 15 MB, or 2.5 MB per CPU). This offers a considerable speed boost: we bypass the DIMMs and the CPU can process the data directly from the cache. Adding this component to the network or block backends means we can keep data in the cache longer and the guest can process it straight off the cache.
Outcomes: Expected outcome:
  • Have upstream patches.
  • A benchmark report comparing performance with and without the feature.


Enabling the 9P File System transport as a paravirt device

Date of insert: 01/20/2014; Verified: Not updated in 2020; GSoC: Yes
Technical contact: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: Required skills include knowledge of kernel hacking, file system internals. Desired skills include: understanding of Xen PV driver structure, and VirtIO.
Description: VirtIO provides a 9P FS transport, which is essentially a paravirt file system device. VMs can mount arbitrary file system hierarchies exposed by the backend. The 9P FS specification has been around for a while, while the VirtIO transport is relatively new. The project would consist of implementing a classic Xen front/back pv driver pair to provide a transport for the 9P FS Protocol.
Outcomes: Expected outcome:
  • LKML patches for front and back end drivers.
  • In particular, a domain should be able to boot from the 9P FS.
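As a sketch of what such a transport carries, here is the wire encoding of the 9P version handshake (Tversion) in Python. The framing and constants follow the public 9P2000 specification; this is illustration only, not part of any existing Xen driver.

```python
import struct

# Minimal encoder for a 9P "Tversion" message, the handshake that opens
# every 9P session.  Wire format (little-endian): size[4] type[1] tag[2]
# msize[4] version[s], where s is a 2-byte length-prefixed string, and
# size counts the whole message including the size field itself.
TVERSION = 100          # message type number from the 9P2000 spec
NOTAG = 0xFFFF          # Tversion must carry the special NOTAG tag

def encode_tversion(msize, version="9P2000"):
    body = struct.pack("<IBHI", 0, TVERSION, NOTAG, msize)
    vbytes = version.encode("ascii")
    body += struct.pack("<H", len(vbytes)) + vbytes
    # Patch the real total size into the first 4 bytes.
    return struct.pack("<I", len(body)) + body[4:]

msg = encode_tversion(8192)
```

A backend receiving this would reply with an Rversion message in the same framing; the rest of the protocol (attach, walk, open, read, write) uses identical size/type/tag headers.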


OVMF Compatibility Support Module support in Xen

Date of insert: 2/5/2014; Verified: Not updated in 2020; GSoC: Yes
Technical contact: Wei Liu <wei.liu2@citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Easy
Skills Needed: Unknown
Description: Currently Xen supports booting HVM guests with either SeaBIOS or OVMF UEFI firmware, but those are separate binaries. OVMF supports embedding a legacy BIOS blob in its binary via its Compatibility Support Module (CSM). We can try to produce a single OVMF binary with SeaBIOS embedded in it, thus having only one firmware binary.

Tasks may include:

  • figure out how CSM works
  • design / implement interface between Hvmloader and the unified binary
Outcomes: Not specified


Improvements to firmware handling for HVM guests

Date of insert: 07/16/2015; Verified: Not updated in 2020; GSoC: Yes
Technical contact: Andrew Cooper <andrew.cooper3@citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Easy
Skills Needed: Unknown
Description: Currently, all firmware is compiled into HVMLoader.

This works, but is awkward when using a single distro-provided SeaBIOS/OVMF designed for general use. In such a case, any time SeaBIOS/OVMF is updated, hvmloader must be rebuilt.

The purpose of this project is to alter hvmloader to take firmware blobs as a multiboot module rather than requiring them to be built in. This reduces the burden of looking after Xen in a distro environment, and will also be useful for developers wanting to work with multiple versions of firmware.

As an extension, support loading an OVMF NVRAM blob. This enables EFI NVRAM support for guests.
Outcomes: Not specified

Hypervisor

Introducing PowerClamp-like driver for Xen

Date of insert: 01/22/2013; Verified: Not updated in 2020; GSoC: Yes
Technical contact: George Dunlap <george.dunlap@eu.citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: Unknown
Description: PowerClamp was introduced to Linux in late 2012 in order to allow users to set a system-wide maximum power usage limit. This is particularly useful for data centers, where there may be a need to reduce power consumption based on the availability of electricity or cooling. A more complete writeup is available at LWN.

These same arguments apply to Xen. The purpose of this project would be to implement similar functionality in Xen, and to make it interface as well as possible with the Linux PowerClamp tools, so that the same tools could be used for both. GSoC_2013#powerclamp-for-xen
Outcomes: Not specified


Integrating NUMA and Tmem

Date of insert: 08/08/2012; Verified: Not updated in 2020; GSoC: Unknown
Technical contact: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>, Dario Faggioli <dario.faggioli@citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: Unknown
Description: NUMA (Non-Uniform Memory Access) systems are advanced server platforms comprising multiple nodes, each of which contains processors and memory. An advanced memory controller allows a node to use memory from all other nodes, but when that happens, data transfer is slower than accessing local memory. Memory access times are therefore not uniform and depend on the location of the memory and the node from which it is accessed, hence the name.

Transcendent memory (Tmem) can be seen as a mechanism for discriminating between frequently and infrequently used data, and thus helping to allocate it properly. It would be interesting to investigate and implement the mechanisms necessary to take advantage of this and improve the performance of Tmem-enabled guests running on NUMA machines.

For instance, implementing something like alloc_page_on_any_node_but_the_current_one() (or any_node_except_this_guests_node_set() for multinode guests), and have Xen's Tmem implementation use it (especially in combination with selfballooning), could solve a significant part of the NUMA problem when running Tmem-enabled guests.
Outcomes: Not specified
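The allocation policy suggested above can be illustrated with a small sketch (in Python rather than hypervisor C, purely to show the idea): pick the least-loaded node outside the guest's own node set, falling back to any node if the guest spans them all. The node numbering and free-page figures are invented.

```python
# Sketch of an alloc_page_on_any_node_but_the_current_one()-style policy:
# Tmem pages should land on nodes the guest does NOT occupy, so they do
# not compete with the guest's local memory allocations.

def node_for_tmem(free_pages_by_node, guest_nodes):
    """Return the node with the most free pages outside guest_nodes,
    falling back to the globally least-loaded node if none exists."""
    candidates = {n: f for n, f in free_pages_by_node.items()
                  if n not in guest_nodes}
    pool = candidates or free_pages_by_node
    return max(pool, key=pool.get)

free = {0: 1000, 1: 5000, 2: 200}
best = node_for_tmem(free, guest_nodes={1})  # node 1 excluded -> node 0
```

The real work is plumbing such a preference through Xen's page allocator and Tmem implementation, especially in combination with selfballooning.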

Userspace Tools

Refactor Linux hotplug scripts

Date of insert: 15/11/2012; Verified: Not updated in 2020; GSoC: Yes
Technical contact: Roger Pau Monné <roger.pau@citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: Unknown
Description: Current Linux hotplug scripts are all entangled, which makes them really difficult to understand or modify. The purpose of the hotplug scripts is to give end users the chance to "easily" support different configurations for Xen devices.

The Linux hotplug scripts should be analyzed, providing a good description of what each hotplug script is doing. After this, the scripts should be cleaned up, putting common pieces of code in shared files across all scripts. A consistent coding style should be applied to all of them when the refactoring is finished.

GSoC_2013#linux-hotplug-scripts
Outcomes: Not specified


XL to XCP VM motion

Date of insert: 15/11/12; Verified: Not updated in 2020; GSoC: Yes
Technical contact: Ian Campbell
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: Unknown
Description: Currently xl (the toolstack supplied alongside Xen) and xapi (the XCP toolstack) have very different concepts about domain configuration, disk image storage, etc. In the XCP model domain configuration is persistent and stored in a database, while under xl domain configuration is written in configuration files. Likewise disk images are stored as VDIs in Storage Repositories, while under xl disk images are simply files or devices in the dom0 filesystem. For more information on xl see XL. For more information on XCP see XCP Overview.

This project is to produce one or more command-line tools which support migrating VMs between these toolstacks.

One tool should be provided which takes an xl configuration file and details of an XCP pool. Using the XenAPI XML/RPC interface, it should create a VM in the pool with a close approximation of the same configuration and stream the configured disk image into a selected Storage Repository.

A second tool should be provided which performs the opposite operation, i.e. given a reference to a VM residing in an XCP pool, it should produce an xl-compatible configuration file and stream the disk image(s) out of Xapi into a suitable format.

These tools could reasonably be bundled as part of either toolstack and by implication could be written in C, OCaml or some other suitable language.

The tool need not operate on a live VM but that could be considered a stretch goal.

An acceptable alternative to the proposed implementation would be to implement a tool which converts between a commonly used VM container format which is supported by XCP (perhaps OVF or similar) and the xl toolstack configuration file and disk image formats.

GSoC_2013#xl-to-xcp-vm-motion
Outcomes: Not specified
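A sketch of the very first step such a tool needs: reading the handful of xl config keys (name, memory, vcpus, disk) that map onto XenAPI VM fields. Real xl configuration files are Python-like; this toy parser handles only the simple key = value subset and is an illustration, not a complete implementation.

```python
import ast

def parse_xl_config(text):
    """Parse 'key = value' lines of an xl config into a dict.
    Values are parsed as Python literals (strings, ints, lists),
    which matches the simple cases of the xl.cfg syntax."""
    cfg = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        cfg[key.strip()] = ast.literal_eval(value.strip())
    return cfg

sample = '''
name = "demo"
memory = 512       # MiB
vcpus = 2
disk = ["file:/var/lib/xen/demo.img,xvda,w"]
'''
cfg = parse_xl_config(sample)
```

From a dict like this, the migration tool would issue the corresponding XenAPI VM.create and VDI import calls; the reverse tool would render such a dict back into an xl config file.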


Allowing guests to boot with a passed-through GPU as the primary display

Date of insert: 01/22/2013; Verified: Not updated in 2020; GSoC: Yes
Technical contact: George Dunlap <george.dunlap@eu.citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: Unknown
Description: One of the primary drivers of Xen in the "consumer market" of the open-source world is the ability to pass through GPUs to guests -- allowing people to run Linux as their main desktop but easily play games requiring proprietary operating systems without rebooting.

GPUs can be easily passed through to guests as secondary displays, but as of yet cannot be passed through as primary displays. The main reason is the lack of ability to load the VGA BIOS from the card into the guest.

The purpose of this project would be to allow HVM guests to load the physical card's VGA BIOS, so that the guest can boot with it as the primary display.

GSoC_2013#gpu-passthrough
Outcomes: Not specified


Advanced Scheduling Parameters

Date of insert: 01/22/2013; Verified: Not updated in 2020; GSoC: Yes
Technical contact: George Dunlap <george.dunlap@eu.citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: Unknown
Description: The credit scheduler provides a range of "knobs" to control guest behavior, including CPU weight and caps. However, a number of users have requested the ability to encode more advanced scheduling logic. For instance: "Let this VM max out for 5 minutes out of any given hour; but after that, impose a cap of 20%, so that even if the system is idle it can't use an unlimited amount of CPU power without paying for a higher level of service."

This is too coarse-grained to do inside the hypervisor; a user-space tool would be sufficient. The goal of this project would be to come up with a good way for admins to support these kinds of complex policies in a simple and robust way.
Outcomes: Not specified
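The example policy above reduces to a small state machine that such a user-space tool would evaluate periodically. The sketch below is a hypothetical illustration: the thresholds are taken from the example, and the idea that the decision would be applied via something like `xl sched-credit -d <domain> -c <cap>` is an assumption about how a tool might act on it, not an existing implementation.

```python
# Decide the credit-scheduler cap to request for a VM, given how long it
# has been running at full speed in the current hour.  A real tool would
# sample CPU usage, track the per-hour budget, and shell out to xl.

BURST_SECONDS = 5 * 60   # allowed full-speed burst per hour (example policy)
CAPPED = 20              # cap in percent once the burst is used up

def desired_cap(busy_seconds_this_hour):
    """Return 0 (uncapped) while inside the burst allowance, else 20."""
    return 0 if busy_seconds_this_hour < BURST_SECONDS else CAPPED
```

The hard parts the project would need to solve are expressing such policies in configuration (rather than code) and handling many VMs robustly, e.g. across daemon restarts.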


CPU/RAM/PCI diagram tool

Date of insert: 01/30/2013; Verified: Not updated in 2020; GSoC: yes
Technical contact: Andy Cooper <andrew.cooper3@citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Low to medium
Skills Needed: Linux scripting; basic understanding of PC server hardware
Description: It is often useful in debugging kernel, hypervisor or performance problems to understand the bus topology of a server. This project will create a layout diagram for a server automatically using data from ACPI Tables, SMBios Tables, lspci output etc. This tool would be useful in general Linux environments including Xen and KVM based virtualisation systems. There are many avenues for extension such as labelling relevant hardware errata, performing bus throughput calculations etc.
Outcomes: A tool is created that can either run on a live Linux system or offline using captured data to produce a graphical representation of the hardware topology of the system including bus topology, hardware device locations, memory bank locations, etc. The tool would be submitted to a suitable open-source project such as the Xen hypervisor project or XCP.
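The data-collection half of such a tool could start as simply as grouping `lspci` output by bus, which a later stage would render graphically. The sample lines below mimic typical lspci output; parsing real systems (domains, bridges, SR-IOV functions) needs considerably more care, so treat this as a sketch only.

```python
import re
from collections import defaultdict

# Matches flat lspci lines of the form "bus:device.function description".
LSPCI_RE = re.compile(r"^([0-9a-f]{2}):([0-9a-f]{2})\.([0-7]) (.*)$")

def group_by_bus(lspci_output):
    """Group PCI devices by bus number: bus -> [(dev.func, description)]."""
    buses = defaultdict(list)
    for line in lspci_output.splitlines():
        m = LSPCI_RE.match(line.strip())
        if m:
            bus, dev, func, desc = m.groups()
            buses[bus].append((f"{dev}.{func}", desc))
    return dict(buses)

sample = """00:00.0 Host bridge: Intel Corporation Xeon E3-1200 DRAM Controller
00:1f.2 SATA controller: Intel Corporation 6 Series SATA AHCI Controller
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network"""
topo = group_by_bus(sample)
```

Combining this with ACPI (NUMA locality) and SMBIOS (DIMM slots) data would give the full picture the project description asks for.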


KDD (Windows Debugger Stub) enhancements

Date of insert: 01/30/2013; Verified: Not updated in 2020; GSoC: yes
Technical contact: Paul Durrant <paul.durrant@citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Medium
Skills Needed: C, Kernel Debuggers, Xen, Windows
Description: kdd is a Windows Debugger Stub for the Xen hypervisor. It is open source, found under http://xenbits.xen.org/hg/xen-unstable.hg/tools/debugger/kdd

kdd allows you to debug a running Windows virtual machine on Xen using standard Windows kernel debugging tools like WinDbg. kdd is an external debugger stub for the Windows kernel; using kdd, Windows can be debugged without enabling the debugger stub inside the Windows kernel. This is important for debugging hard-to-reproduce problems on Windows virtual machines that may not have debugging enabled.

Expected Results:

  1. Add support for Windows 8 (x86, x64) to kdd
  2. Add support for Windows Server 2012 to kdd
  3. Enhance kdd to allow WinDbg to write out usable Windows memory dumps (via .dump debugger extension) for all supported versions
  4. Produce a user guide for kdd on Xen wiki page
Nice to have: Allow kdd to operate on a Windows domain checkpoint file (the output of xl save, for example)
Outcomes: Code is submitted to xen-devel@xen.org for inclusion in the xen-unstable project.


Lazy restore using memory paging

Date of insert: 01/20/2014; Verified: Not updated in 2020; GSoC: Yes
Technical contact: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: A good understanding of save/restore, and virtualized memory management (e.g. EPT, shadow page tables, etc). In principle the entire project can be implemented in user-space C code, but it may be the case that new hypercalls are needed for performance reasons.
Description: VM save/restore results in a large amount of I/O and non-trivial downtime, as the entire memory footprint of a VM is read from disk.

Xen memory paging support in x86 is now mature enough to allow for lazy restore, whereby the footprint of a VM is backfilled while the VM executes. If the VM hits a page not yet present, it is eagerly paged in.

There has been some concern recently about the lack of docs and/or mature tools that use xen-paging. This is a good way to address the problem.
Outcomes: Expected outcome:
  • Mainline patches for libxc and libxl
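The lazy-restore idea can be modeled in a few lines. This is a toy Python model, not Xen code: the dictionary stands in for the saved image on disk, and a missing key stands in for the page fault that triggers an eager page-in.

```python
# Toy model of lazy restore: pages are faulted in from the saved image on
# first access instead of being read up front, so the VM can start running
# before its full memory footprint is resident.

class LazyRestore:
    def __init__(self, backing):
        self.backing = backing      # full saved image: page -> data
        self.memory = {}            # pages faulted in so far

    def access(self, page):
        if page not in self.memory:          # "page fault": eager page-in
            self.memory[page] = self.backing[page]
        return self.memory[page]

vm = LazyRestore({0: b"boot", 1: b"heap", 2: b"stack"})
vm.access(0)   # only the touched page becomes resident
```

In the real project, the "backing" is the save file, the fault is delivered via Xen's memory-paging interface, and a background thread backfills untouched pages while the VM executes.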


CPUID Programming for Humans

Date of insert: 02/04/2014; Verified: Not updated in 2020; GSoC: Yes
Technical contact: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: A good understanding of C user-land programming, and the ability to dive into qemu/libvirt (for reference code and integration), as well as libxc and libxl (for implementation).
Description: When creating a VM, a policy is applied to mask certain CPUID features. Right now it's black magic.

The KVM stack has done an excellent job of making this human-usable and understandable.

For example, in a qemu-kvm command-line you may encounter:

-cpu SandyBridge,+pdpe1gb,+osxsave,+dca,+pcid,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme

And in <qemu>/target-i386.c you find a fairly comprehensive description of x86 processor models, what CPUID features are inherent, and what CPUID feature each of these symbolic flags enables.

In the Xen world, there is a libxc interface to do the same, although it's all hex and register driven. It's effective, yet horrible to use.

An ideal outcome would have libxl config files and the command line absorb a similarly human-friendly description of the CPUID features a user wishes for the VM, and interface appropriately with libxl. Further, autodetection of the best CPUID should yield human-readable output, to make it easy to understand what the VM thinks about its processor.

Finally, interfacing with libvirt should be carefully considered.

CPUID management is crucial in a heterogeneous cluster where migrations and save restore require careful processor feature selection to avoid blow-ups.
Outcomes: Expected outcome:
  • Mainline patches for libxl
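The "human-friendly" layer the project asks for boils down to a table mapping symbolic flag names to (leaf, register, bit) positions, plus a helper that folds "+flag" strings into per-register masks the way the qemu -cpu syntax does. The sketch below covers only a few well-known CPUID leaf 1 bits; a real tool would carry the full x86 feature list, as qemu's target-i386.c does.

```python
# name: (CPUID leaf, register, bit) -- a tiny subset of the x86 feature set.
FEATURES = {
    "vmx":  (0x1, "ecx", 5),
    "smx":  (0x1, "ecx", 6),
    "pcid": (0x1, "ecx", 17),
    "sse2": (0x1, "edx", 26),
}

def masks_for(flags):
    """Turn ["+vmx", "+pcid", ...] into {(leaf, register): bitmask}."""
    masks = {}
    for flag in flags:
        leaf, reg, bit = FEATURES[flag.lstrip("+")]
        masks.setdefault((leaf, reg), 0)
        masks[(leaf, reg)] |= 1 << bit
    return masks

m = masks_for(["+vmx", "+pcid"])
```

The inverse direction (mask back to names) is what would make the autodetected CPUID policy readable to humans.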

Mirage and XAPI projects

There are separate wiki pages about XCP and XAPI related projects. Make sure you check these out as well!


Create a tiny VM for easy load testing

Date of insert: 01/30/2013; Verified: Not updated in 2020; GSoC: yes
Technical contact: Dave Scott <first.last@citrix.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Medium
Skills Needed: OCaml
Description: The http://www.openmirage.org/ framework can be used to create tiny 'exokernels': entire software stacks which run directly on the xen hypervisor. These VMs have such a small memory footprint (16 MiB or less) that many of them can be run even on relatively small hosts. The goal of this project is to create a specific 'exokernel' that can be configured to generate a specific I/O pattern, and to create configurations that mimic the boot sequence of Linux and Windows guests. The resulting exokernel will then enable cheap system load testing.

The first task is to generate an I/O trace from a VM. For this we could use 'xen-disk', a userspace Mirage application which acts as a block backend for xen guests (see http://openmirage.org/wiki/xen-synthesize-virtual-disk). Following the wiki instructions we could modify a 'file' backend to log the request timestamps, offsets, buffer lengths.

The second task is to create a simple kernel based on one of the MirageOS examples (see http://github.com/mirage/mirage-skeleton). The 'basic_block' example shows how reads and writes are done. The previously-generated log could be statically compiled into the kernel and executed to generate load.
Outcomes: 1. a repository containing an 'exokernel' (see http://github.com/mirage/mirage-skeleton) 2. at least 2 I/O traces, one for Windows boot and one for Linux boot (any version)
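The trace the first task produces could be as simple as one line per request: timestamp, operation, offset, length. The exact fields are an assumption for illustration; the point is that a format this simple is enough for the exokernel to replay.

```python
# Parse a block I/O trace of "timestamp op offset length" lines into
# tuples the replay kernel could iterate over at the recorded times.

def parse_trace(text):
    ops = []
    for line in text.splitlines():
        if not line.strip():
            continue
        t, op, offset, length = line.split()
        ops.append((float(t), op, int(offset), int(length)))
    return ops

trace = """0.000 read 0 4096
0.012 read 2048 8192
0.040 write 4096 4096"""
ops = parse_trace(trace)
total_read = sum(n for _, op, _, n in ops if op == "read")
```

A log like this, statically compiled into the MirageOS kernel as the description suggests, fully determines the synthetic load.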


Fuzz testing Xen with Mirage

Date of insert: 28/11/2012; Verified: Not updated in 2020; GSoC: yes
Technical contact: Anil Madhavapeddy <anil@recoil.org>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: medium
Skills Needed: OCaml
Description: MirageOS (http://openmirage.org) is a type-safe exokernel written in OCaml which generates highly specialised "appliance" VMs that run directly on Xen without requiring an intervening guest kernel. We would like to use the Mirage/Xen libraries to fuzz test all levels of a typical cloud toolstack. Mirage has low-level bindings for Xen hypercalls, mid-level bindings for domain management, and high-level bindings to XCP for cluster management. This project would build a QuickCheck-style fuzzing mechanism that would perform millions of random operations against a real cluster, and identify bugs with useful backtraces.

The first task would be to become familiar with a specification-based testing tool like Kaputt (see http://kaputt.x9c.fr/). The second task would be to choose an interface for testing; perhaps one of the hypercall ones.

GSoC_2013#fuzz-testing-mirage
Outcomes: 1. a repo containing a fuzz testing tool; 2. some unexpected behaviour with a backtrace (NB it's not required that we find a critical bug, we just need to show the approach works)
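The heart of a QuickCheck-style driver is generating a reproducible random sequence of operations from a seed, so any failing run can be replayed from its seed alone. The sketch below uses Python and placeholder operation names rather than Mirage's OCaml hypercall bindings; it only illustrates the seeding/replay idea.

```python
import random

# Placeholder toolstack operations -- the real tool would call domain
# management or hypercall bindings instead.
OPS = ["create_domain", "pause", "unpause", "save", "restore", "destroy"]

def gen_sequence(seed, length=10):
    """Deterministically generate a random op sequence from a seed."""
    rng = random.Random(seed)
    return [rng.choice(OPS) for _ in range(length)]

# Same seed -> same sequence, which is what makes failures reproducible.
a = gen_sequence(42)
b = gen_sequence(42)
```

On a failure, the harness would report the seed and the backtrace; shrinking (searching for a shorter failing prefix) is the natural next step.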


From simulation to emulation to production: self-scaling apps

Date of insert: 28/11/2012; Verified: Not updated in 2020; GSoC: no, too much work
Technical contact: Anil Madhavapeddy <anil@recoil.org>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: hard
Skills Needed: OCaml
Description: MirageOS (http://openmirage.org) is a type-safe exokernel written in OCaml which generates highly specialised "appliance" VMs that run directly on Xen without requiring an intervening guest kernel. An interesting consequence of programming Mirage applications in a functional language is that the device drivers can be substituted with emulated equivalents. Therefore, it should be possible to test an application under extreme load conditions as a simulation, and then recompile the *same* code into production. The simulation can inject faults and test data structures under distributed conditions, but using a fraction of the resources required for a real deployment.

The first task is to familiarise yourself with a typical Mirage application, I suggest a webserver (see https://github.com/mirage/mirage-www). The second task is to replace the ethernet driver with a synthetic equivalent, so we can feed it simulated traffic. Third, we should inject simulated web traffic (recorded from a real session) and attempt to determine how the application response time varies with load (number of connections; incoming packet rate).

This project will require a solid grasp of distributed protocols, and functional programming. Okasaki's book will be a useful resource...
Outcomes: 1. a repo/branch with a fake ethernet device and a traffic simulator; 2. an interesting performance graph


Towards a multi-language unikernel substrate for Xen

Date of insert: 28/11/2012; Verified: Not updated in 2020; GSoC: no, too difficult
Technical contact: Anil Madhavapeddy <anil@recoil.org>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: hard
Skills Needed: OCaml, Haskell, Java
Description: There are several languages available that compile directly to Xen microkernels, instead of running under an intervening guest OS. We're dubbing such specialised binaries as "unikernels". Examples include:

Each of these is in a different state of reliability and usability. We would like to survey all of them, build some common representative benchmarks to evaluate them, and build a common toolchain based on XCP that will make it easier to share code across such efforts. This project will require a reasonable grasp of several programming languages and runtimes, and should be an excellent project to learn more about the innards of popular languages.

GSoC_2013#unikernel-substrate
Outcomes: 1. a repo containing a common library of low-level functions; 2. a proof of concept port of at least 2 systems to this new library


DRBD Integration

Date of insert: 07/01/2013; Verified: Not updated in 2020; GSoC: Unknown
Technical contact: John Morris <john@zultron.com>
Mailing list/forum for project: xen-devel@
IRC channel for project: #xen-devel
Difficulty: Unknown
Skills Needed: Unknown
Description: DRBD is potentially a great addition to the other high-availability features in XenAPI. An architecture of as few as two Dom0s with DRBD mirrored local storage is an inexpensive minimal HA configuration enabling live migration of VMs between physical hosts and providing failover in case of disk failure, and eliminates the need for external storage. This setup can be used in small shop or hobbyist environments, or could be used as a basic unit in a much larger scalable architecture.

Existing attempts at integrating DRBD sit below the SM layer and thus do not enable one VBD per DRBD device. They also suffer from a split-brain situation that could be avoided by controlling active/standby status from XenAPI.

DRBD should be implemented as a new SR type on top of LVM. The tools for managing DRBD devices need to be built into storage management, along with the logic for switching the active and standby nodes.
Outcomes: Not specified

Quick links to changelogs of the various Xen related repositories/trees

Please see XenRepositories wiki page!