Xen Maintainer, Committer and Developer Meeting/February 2013 Minutes

Attendees

  • Ian Campbell
  • Daniel De Graaf
  • Andres Lagar-Cavilla (+ another from Gridcentric?)
  • Boris Ostrovsky
  • Olaf Hering
  • Lars Kurth
  • James Bulpin
  • Jan Beulich
  • Jason Douglas
  • Konrad Rzeszutek Wilk
  • Donald Dugger
  • Jun Nakajima

Apologies

  • Matt Wilson

Mirage Incubation Project

  • Anil Madhavapeddy approached Lars before Christmas about the possibility of OpenMirage joining Xen.org as an incubation project
  • Mirage is closely tied to the Xen PV interfaces
  • Project was proposed for community review in January
  • Votes as per announcement
  • Amir Chaudhry from University of Cambridge Computer Lab (Community Role within Mirage) was present and gave an elevator pitch for the project:
    • Mirage is revisiting the library Operating System concept previously explored by the computer lab.
    • Treats the hypervisor as a stable hardware platform, removing the requirement for lots of hardware drivers.
    • Includes reimplementations of TCP, DNS, SSH, HTTP
    • Compiles to a single appliance kernel.
    • http://openmirage.org is self hosting
    • Useful for testing
    • Useful security properties / reduced attack surface
  • Andres: Likes this work, thinks Xen.org and the community both benefit from these sorts of projects, so it's win-win
  • James: Good demonstration of Xen's capabilities, already useful for testing in XenServer and the team are interested to use it as part of their disaggregation strategy
  • Konrad: Asked what the implications were for existing Xen.org projects, in terms of maintenance
    • Ian: Each project is run independently as its own sub-project; Xen.org supplies infrastructure (lists, repos, wiki, etc). There is no requirement for members of existing projects to maintain it, i.e. no burden on e.g. the Hypervisor project itself.
    • This is described in the project proposal
    • Lars was out of the room at this point but Ian confirmed the above with him afterwards. Each project builds its own community and this is a criterion for graduation from incubation.
  • Jun: Asked about OpenFlow support: is Mirage able to run as an OF controller?
    • Amir: Code exists but he is unsure of its status
    • Ian has heard that this works

Previous ACTION items

  • AMD to nominate a new maintainer
    • No one from AMD was present.
  • ACPI ERST handling
    • Boris: Last he heard AMD were unable to find a machine
    • IanC: Spoke to IanJ about BIOSes in the *beetle machines
      • Have not been updated since they were purchased
      • Is an option to do so
    • IanC: Forwarded a request from Sherry regarding the exact nature of the *beetle machines to IanJ.
    • ACTION: Ping IanJ again about both BIOS upgrade and *beetle specs (DONE)
  • Performance measurement
    • Konrad started a conversation on xen-devel
    • James has pinged XenServer team to become involved and will encourage them to remain involved.
    • Konrad is looking for people to sign up to implementing specific requirements
    • Boris will produce a list of requirements/TODO items and circulate.
    • Konrad will work with James to divide up the list.
  • Kexec
    • David and Daniel have been talking on list.
  • Testing for security
    • projects relating to the osstest system are in the GSoC projects list
  • Code contribution stats
    • Lars has circulated these

Current Technical Challenges

  • Outstanding MSI-X issue
    • Jan has posted patches to the list outlining a fix for a long-standing issue which allows PV guests with passthrough devices unfettered access to the MSI control registers in PCI CFG space
      • Clarification: The problem is with access to the MSI-X table, living in MMIO space. The (supposedly read-only) Pending Bit Array is similarly affected (in case devices don't actually implement this in a read-only fashion), but since MSI-X table and PBA can share pages, the two need to be (and are being) dealt with together anyway.
    • Lack of ACK or feedback on the approach is concerning
      • The patches require dom0 to inform the hypervisor at the right times when regions are passed through to guests, such that it can ensure they are read only (is this correct???)
        • Clarification: Yes, that's correct. The main aspect here is that no resource re-assignment (IOW changes to the PCI BARs of the affected devices) must occur after Xen got the "going to be in passed through" notification.
      • Jan is not totally convinced by the approach relying on dom0 and is looking for (and not receiving) general feedback
      • Other option is snooping PCI CFG space writes, but this is hard
        • Clarification: All registers of MSI-X capable devices, in fact.
        • Emulating accesses to MSI-X registers in PCI MMIO CFG space
    • Issue is PV only, since HVM guests already go via qemu and do not access real CFG space directly
      • Clarification (Jan): I was slightly mistaken here: Neither PV nor HVM guests access config space directly (both go through pciback). So the issue isn't with CFG accesses, but with the MMIO resource holding the MSI-X table. And this one _is_ being accessed directly by PV guests, whereas HVM ones have the P2M in between (which is where the write permission gets revoked when the first MSI-X interrupt on a device gets set up). As Xen does the corresponding page table modifications for the guest, it can suppress the write access here provided it knows what regions it needs to protect. Thus the need for knowing the regions _before_ the guest starts (and can set up any writable mappings to them).
    • Jan asked how to kickstart a discussion
      • Ian suggested pinging, then pinging louder, plus private prods to individuals who could be expected to have an opinion (e.g. Keir)
      • Intel engineers have previously been involved in the discussion. Thought this issue was done.
        • ACTION: Don to ping the relevant people in Intel to respond to Jan.
        • ACTION: James to ping Andy Cooper.
  • Approach to functional degradation
    • James: The XenServer product team have expressed concerns about the manner in which functional deprecation (e.g. broken h/w features) is handled in the hypervisor (e.g. disabling the IOMMU as part of XSA-36)
      • Would like to see a less static approach than hypervisor command line options, e.g. something exposed by the toolstack
    • IanC: WRT security updates, the security team wants to make changes which are as simple and obvious as possible, which precludes implementing toolstack configuration options as part of a security update. However, if the infrastructure was already in place and the behaviour could be changed with a small patch to the hypervisor then he sees no reason not to use it.
    • Jan: Some features are hard to enable after boot, or are hard to toggle once enabled
    • Jun: Depending on the feature it may not be desirable to allow control, since attacker could use this to switch things off (e.g. SMEP?). Perhaps settings should be "sticky"
    • ACTION: James to work with his team to evaluate previous instances of such issues and propose specific solutions/patches to enable this to be done differently going forward
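
The MSI-X clarifications above say that Xen must know, before the guest starts, which MMIO pages hold the MSI-X table so that it can refuse writable PV mappings of them. The following is a minimal sketch of that idea, not the approach taken in Jan's patches. The register layout (BIR in the low 3 bits of the Table Offset/BIR register, table size encoded as N-1 in the low 11 bits of Message Control, 16 bytes per entry) follows the PCI specification; the function names, the 4 KiB page size and the set-based bookkeeping are assumptions made for illustration only.

```python
PAGE_SIZE = 4096          # assumption: 4 KiB pages
MSIX_ENTRY_SIZE = 16      # bytes per MSI-X table entry (PCI specification)


def msix_table_pages(bar_base, table_offset_bir, message_control):
    """Return the page frame numbers covered by a device's MSI-X table.

    bar_base          -- physical base address of the BAR selected by the BIR
    table_offset_bir  -- raw Table Offset/BIR register of the MSI-X capability
    message_control   -- raw Message Control register of the MSI-X capability
    """
    offset = table_offset_bir & ~0x7           # low 3 bits hold the BIR
    entries = (message_control & 0x7FF) + 1    # table size is encoded as N-1
    start = bar_base + offset
    end = start + entries * MSIX_ENTRY_SIZE    # exclusive end address
    return set(range(start // PAGE_SIZE, (end - 1) // PAGE_SIZE + 1))


def filter_pv_mapping(pfn, want_write, protected_pfns):
    """Hypothetical check: may a PV guest mapping of `pfn` be writable?

    Write access is granted only if it was requested and the page is not
    one of the protected MSI-X table pages.
    """
    return want_write and pfn not in protected_pfns


# Hypothetical device: BAR at 0xFEB00000, table at offset 0x2000, 8 entries.
protected = msix_table_pages(0xFEB00000, 0x2003, 0x0007)
```

The set of protected pages is only valid while the BARs stay where they were when it was computed, which is why the clarification insists that no resource re-assignment may happen after Xen receives the notification.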

Current and future work

  • 4.3 Release Status
    • Ian asked Konrad about the status of PVH
      • Konrad's understanding is that it needs review of the h/v side
      • Jan: lots of TODOs etc still present in the code.
      • Konrad: Definitely no e.g. migration support at the moment, and some other missing features on the domU side, but can do dom0 for 4.3
      • Jan: TODOs refer to e.g. implementation issues and not just missing features. This
      • Konrad: Should be able to do dom0 only PVH as a tech preview in 4.3.
      • ACTION: Konrad or Mukesh to communicate this to George
      • Jun asked why PVH was targeting dom0 first and not domU (which logically often comes first)
        • Konrad: The goal of the project was to improve dom0
      • Jan: Could consider slipping the release by e.g. 1 month to allow more time for PVH and event channel work.
      • Lars asked what the impact of a 1 month slip would be on PVH => Unknown
    • Event channel scalability work
      • Ian asked the call how important this feature was for 4.3 and asked for guidance on what to do.
      • There are two solutions on the table
        • 3-level event channels by Wei Liu. Extends the existing scheme with an extra level. Is mostly implemented
        • FIFO/Queue based design by David Vrabel, has some interesting properties but currently only a design with no implementation
          • There are currently no man hours assigned to implementing this solution.
      • Need to decide between
        • No event channel scalability in 4.3 (wait for 4.4)
        • Take 3-level solution for 4.3, with the possibility to replace by the queue model in a future release
          • This would mean supporting L2 (current scheme) + L3 in 4.3 and L2 + QUEUE in 4.4, does not require supporting L2 + L3 + QUEUE in 4.4
        • Slip to give time to implement QUEUE for 4.3
          • Jan proposed that 1 month slippage to allow for this (and PVH) would be OK, but 2 months would be too much. James agreed.
          • It was not clear that even with a slip this would get implemented
      • ACTION: James to discuss with XenServer team about resourcing the implementation of the QUEUE model.
      • ACTION: Ian to discuss with platform team about implementing this
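
To illustrate why an extra level helps with event channel scalability, here is a rough sketch of the arithmetic behind a multi-level bitmap scheme. In the current 2-level scheme each bit of a selector word picks one word of pending bits, so the limit is the word size squared (1024 channels for 32-bit guests, 4096 for 64-bit guests); adding a third level multiplies that by the word size again. The function names are hypothetical and the code does not reflect the in-memory layout of the 3-level patches, which the minutes do not describe.

```python
def max_event_channels(bits_per_long, levels):
    """Upper bound on event channels for an N-level bitmap scheme."""
    return bits_per_long ** levels


def split_port(port, bits_per_long, levels):
    """Split a port number into one bit index per level, outermost first."""
    if not 0 <= port < max_event_channels(bits_per_long, levels):
        raise ValueError("port out of range for this layout")
    indices = []
    for _ in range(levels):
        indices.append(port % bits_per_long)   # bit within the current word
        port //= bits_per_long                 # move up one level
    return tuple(reversed(indices))


for bits in (32, 64):
    for levels in (2, 3):
        print(bits, levels, max_event_channels(bits, levels))
```

A port that does not fit the 2-level layout on a 64-bit guest, such as 4096, becomes addressable once the third level exists: `split_port(4096, 64, 3)` yields one index per level.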

Community News

  • Xen.org events
  • Lars sent out an email about event planning.
    • Has listened to feedback, will be targeting 2 developer focused events per year.
      • Unfortunately we have ended up with 2 events in Europe this year (Hackathon, Dublin, May & XenSummit @ LinuxCon Europe, Edinburgh, October)
      • Colocated events need to be booked well in advance
      • Going forward will try to alternate continents (Europe, NA, Asia) and build up a pipeline of events
    • User oriented "Xen Days" will continue, training days etc.