Difference between revisions of "RTDS-Based-Scheduler"

From Xen
Revision as of 15:47, 5 May 2016

The scheduler was included as an experimental feature in Xen 4.5 and is still under development.


Real-Time-Deferrable-Server(RTDS)-Based CPU Scheduler

Introduction

The Real-Time Deferrable Server (RTDS) scheduler is a real-time CPU scheduler built to provide guaranteed CPU capacity to guest VMs on SMP hosts. It was introduced under the name rtds in Xen 4.5 as an experimental scheduler. The latest version, in Xen 4.7, has been converted from a quantum-driven model to an event-driven model.

Description

Each VCPU of each domain is assigned a budget and a period. A VCPU with parameters <budget>, <period> is supposed to run for <budget>us (not necessarily continuously) in every <period>us.
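As a rough illustration of what a <budget>, <period> pair means, the sketch below computes the fraction of one physical CPU that such a VCPU is entitled to. The function name cpu_share and the example values are hypothetical and are not part of Xen or xl.

```python
def cpu_share(budget_us: int, period_us: int) -> float:
    """Return the fraction of one PCPU reserved by a (budget, period) pair.

    Hypothetical helper for illustration only; both values are in microseconds.
    """
    if period_us <= 0:
        raise ValueError("period must be positive")
    if not 0 <= budget_us <= period_us:
        raise ValueError("budget must lie between 0 and the period")
    return budget_us / period_us


# A VCPU with budget 4000us and period 10000us may run for 4ms in every 10ms.
share = cpu_share(budget_us=4000, period_us=10000)
print(f"reserved share of one PCPU: {share:.0%}")
```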

Note: The VCPUs of the same domain have the same parameters right now.

In Xen 4.7, budget replenishment and budget enforcement are separated by adding a replenishment timer, which fires at the earliest upcoming release time among all runnable vcpus.

A replenishment queue has been added to keep track of all replenishment events.

The following functions have changed substantially in order to manage the replenishment events and the timer.

repl_handler(): a timer handler that replenishes vcpus; it is re-programmed to fire at the nearest vcpu deadline.

rt_schedule(): picks the highest-priority runnable vcpu, subject to cpu affinity; ret.time is passed to schedule(). If an idle vcpu is picked, -1 is returned to avoid busy-waiting.

repl_update() has been removed.

rt_vcpu_wake(): when a vcpu wakes up, the scheduler tickles a pcpu for it instead of picking a vcpu from the run queue.

rt_context_saved(): when the context switch has finished, the preempted vcpu is put back into the run queue and a pcpu is tickled for it.

Simplified functional graph:

schedule.c SCHEDULE_SOFTIRQ:

   rt_schedule():
       [spin_lock]
       burn_budget(scurr)
       snext = runq_pick()
       [spin_unlock]

sched_rt.c TIMER_SOFTIRQ:

   replenishment_timer_handler()
       [spin_lock]
       <for_each_vcpu_on_q(i)> {
           replenish(i)
       }
       runq_tickle()
       program_timer()
       [spin_unlock]
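The replenishment path in the graph above can be sketched as a small simulation. This is not the Xen code: the Vcpu class, its field names, and the heap-based queue are assumptions made for illustration, and locking and tickling are left out. The handler refills every vcpu whose release time has arrived and returns the time at which the timer would fire next.

```python
import heapq


class Vcpu:
    """Hypothetical stand-in for a scheduler vcpu (times in microseconds)."""

    def __init__(self, name, period_us, budget_us):
        self.name = name
        self.period_us = period_us
        self.budget_us = budget_us
        self.cur_budget_us = budget_us
        self.cur_deadline_us = period_us  # the first period is [0, period)


def replenishment_handler(now_us, replq):
    """Replenish all due vcpus; return (replenished names, next fire time).

    replq is a heap of (release_time_us, seq, vcpu). The unique seq value
    breaks ties so that Vcpu objects are never compared with each other.
    """
    replenished = []
    while replq and replq[0][0] <= now_us:
        _, seq, vcpu = heapq.heappop(replq)
        # Move the deadline forward until it lies in the future.
        while vcpu.cur_deadline_us <= now_us:
            vcpu.cur_deadline_us += vcpu.period_us
        vcpu.cur_budget_us = vcpu.budget_us  # full refill, old budget discarded
        heapq.heappush(replq, (vcpu.cur_deadline_us, seq, vcpu))
        replenished.append(vcpu.name)
    next_fire_us = replq[0][0] if replq else None
    return replenished, next_fire_us


a = Vcpu("a", period_us=10000, budget_us=4000)
b = Vcpu("b", period_us=25000, budget_us=5000)
a.cur_budget_us = 0  # pretend "a" has used up its budget
replq = [(a.cur_deadline_us, 0, a), (b.cur_deadline_us, 1, b)]
heapq.heapify(replq)

done, next_fire = replenishment_handler(10000, replq)
print(done, next_fire)
```

Keeping the events in a structure ordered by release time means the handler only touches vcpus that are actually due, rather than scanning every vcpu on each invocation.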

Usage

The xl sched-rtds command can be used to tune the scheduler parameters of each guest VM.

  • xl sched-rtds -d <domain> : List the parameters of the specified <domain>
  • xl sched-rtds -d <domain> -p <period> -b <budget> : Set each VCPU's budget to <budget>us and period to <period>us for the specified <domain>
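For scripted use, the two invocations above could be assembled as in the following sketch. The helper sched_rtds_argv is hypothetical; it uses only the -d, -p and -b options listed above, and the check that the budget does not exceed the period is an assumption of this sketch rather than documented xl behaviour.

```python
def sched_rtds_argv(domain, period_us=None, budget_us=None):
    """Build an argument list for `xl sched-rtds` (hypothetical helper).

    With only a domain, the list form of the command is produced;
    with both period and budget, the set form is produced.
    """
    argv = ["xl", "sched-rtds", "-d", str(domain)]
    if period_us is None and budget_us is None:
        return argv
    if period_us is None or budget_us is None:
        raise ValueError("period and budget must be given together")
    if not 0 < budget_us <= period_us:
        raise ValueError("budget must be positive and no larger than the period")
    return argv + ["-p", str(period_us), "-b", str(budget_us)]


list_cmd = sched_rtds_argv("vm1")
set_cmd = sched_rtds_argv("vm1", period_us=10000, budget_us=4000)
print(" ".join(list_cmd))
print(" ".join(set_cmd))
```

The resulting list can be handed to subprocess.run() on a host where xl is available.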

TO DO: include the new per-vcpu usage

Algorithm

The design of this rtds scheduler is as follows:

Each VCPU has a dedicated period and budget. The deadline of a VCPU is at the end of each period. A VCPU has its budget replenished at the beginning of each period. While scheduled, a VCPU burns its budget. The VCPU needs to finish its budget before its deadline in each period, and it discards any unused budget at the end of each period. If a VCPU runs out of budget in a period, it has to wait until the next period.

Each VCPU is implemented as a deferrable server. When a VCPU has a task running on it, its budget is continuously burned; when a VCPU has no task but still has budget left, its budget is preserved.
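The budget rules above can be illustrated with a minimal single-VCPU simulation. The class DeferrableServer and its methods are hypothetical names chosen for this sketch, and the model ignores scheduling between VCPUs: it only shows replenishment at the start of a period, burning while running, preservation while idle, and discarding at the end of a period.

```python
class DeferrableServer:
    """Toy model of one VCPU's budget accounting (times in microseconds)."""

    def __init__(self, period_us, budget_us):
        self.period_us = period_us
        self.budget_us = budget_us
        self.remaining_us = budget_us
        self.deadline_us = period_us  # end of the current period

    def advance_to(self, now_us):
        """Enter the period containing now_us, refilling the budget.

        Leftover budget from a finished period is discarded, not carried over.
        """
        while now_us >= self.deadline_us:
            self.deadline_us += self.period_us
            self.remaining_us = self.budget_us

    def run(self, now_us, wanted_us):
        """Run for up to wanted_us starting at now_us; return the time granted."""
        self.advance_to(now_us)
        granted = min(wanted_us, self.remaining_us, self.deadline_us - now_us)
        self.remaining_us -= granted  # budget burns only while running
        return granted


vcpu = DeferrableServer(period_us=10000, budget_us=4000)
first = vcpu.run(0, 1000)        # burns 1000us, 3000us left
second = vcpu.run(6000, 5000)    # idle until 6000us: the 3000us were preserved
third = vcpu.run(9500, 100)      # budget exhausted, must wait for next period
fourth = vcpu.run(10000, 100)    # new period: budget refilled to 4000us
print(first, second, third, fourth, vcpu.remaining_us)
```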

This scheduler follows preemptive global Earliest Deadline First (EDF) scheduling theory from the real-time field to schedule these VCPUs. At any scheduling point, a VCPU with an earlier deadline has higher priority. The scheduler always picks the highest-priority VCPU to run on a feasible PCPU. A PCPU is feasible for a VCPU if the PCPU is idle or has a lower-priority VCPU running on it.
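The feasibility rule can be sketched as follows. The function pick_pcpu is a hypothetical illustration, not the Xen implementation, and it ignores cpu affinity; preferring an idle PCPU and otherwise preempting the PCPU whose running VCPU has the latest deadline is a policy assumed for this sketch.

```python
def pick_pcpu(candidate_deadline_us, running):
    """Choose a feasible PCPU for a VCPU with the given deadline.

    running maps a PCPU id to the deadline of the VCPU running on it,
    or to None if that PCPU is idle. Returns a PCPU id, or None when
    no PCPU is feasible.
    """
    if not running:
        return None
    # An idle PCPU is always feasible.
    for pcpu, deadline in running.items():
        if deadline is None:
            return pcpu
    # Otherwise consider the lowest-priority running VCPU (latest deadline).
    victim = max(running, key=lambda pcpu: running[pcpu])
    if running[victim] > candidate_deadline_us:
        return victim  # the candidate has higher priority and preempts it
    return None


on_idle = pick_pcpu(50, {0: 30, 1: None})
on_preempt = pick_pcpu(50, {0: 30, 1: 80})
infeasible = pick_pcpu(50, {0: 30, 1: 40})
print(on_idle, on_preempt, infeasible)
```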

Queue scheme: Each CPU pool has a global runqueue and a global depletedq. The runqueue holds all runnable VCPUs that have budget, sorted by deadline; the depletedq holds all VCPUs without budget and is unsorted.
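A minimal sketch of this queue scheme for a single CPU pool is shown below. The class PoolQueues and its methods are hypothetical; the sketch keeps the runqueue ordered with bisect.insort and appends depleted VCPUs without any ordering.

```python
import bisect


class PoolQueues:
    """Toy model of the per-pool runqueue and depletedq."""

    def __init__(self):
        self.runq = []       # (deadline_us, name) tuples, sorted by deadline
        self.depletedq = []  # names of VCPUs without budget, unsorted

    def insert(self, name, deadline_us, remaining_us):
        """Queue a runnable VCPU according to whether it has budget left."""
        if remaining_us > 0:
            bisect.insort(self.runq, (deadline_us, name))
        else:
            self.depletedq.append(name)

    def pick(self):
        """Return the name of the earliest-deadline VCPU, or None if empty."""
        return self.runq[0][1] if self.runq else None


queues = PoolQueues()
queues.insert("v1", deadline_us=30000, remaining_us=1000)
queues.insert("v2", deadline_us=10000, remaining_us=500)
queues.insert("v3", deadline_us=20000, remaining_us=0)
print([name for _, name in queues.runq], queues.depletedq, queues.pick())
```

Because the runqueue is kept sorted, picking the highest-priority VCPU only requires looking at its head.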

Features developed in Xen 4.7

  • (done) Burn budget at a finer granularity than 1ms;
  • (done) Use a separate timer per VCPU for each VCPU's budget replenishment, instead of periodically scanning the full runqueue;
  • Handle time stolen from domU by the hypervisor. On a machine with many sockets and many cores, the spin-lock for the global RunQ used in the rtds scheduler could eat into domU's time, leaving domU with less budget than it requires.
  • (done) Toolstack supports assigning and displaying each VCPU's parameters for each domain.

Glossary of Terms

  • us: microsecond
  • Host: The physical hardware running Xen and hosting guest VMs.
  • VM/domU: Guest virtual machine.
  • VCPU: Virtual CPU (one or more per VM).
  • CPU/PCPU: Physical host CPU.
  • Period: The recurring interval at the start of which a VCPU's budget is replenished and at the end of which any unused budget is discarded.
  • Budget: The amount of time a VCPU can execute within its period.

Also See