Difference between revisions of "Xen Project Schedulers"

From Xen
m (Also See)
(RTDS is still marked experimental in xen/common/sched_rt.c (HEAD))
 
(19 intermediate revisions by 3 users not shown)
Line 1: Line 1:
  +
__TOC__
<!-- MoinMoin name: Scheduling -->
 
  +
= Overview =
<!-- Comment: -->
 
  +
The Xen Project Hypervisor supports several different virtual CPU schedulers, with different properties.
<!-- WikiMedia name: Scheduling -->
 
<!-- Page revision: 00000017 -->
 
<!-- Original date: Sat Jun 9 18:29:49 2007 (1181413789000000) -->
 
   
  +
The job of a hypervisor's scheduler is to decide, among all the vCPUs of the various virtual machines, which ones should run on the host's physical CPUs (pCPUs) at any given time.
{{TODO|This document gives an overview of Xen scheduling, but is likely out-of-date}}
 
 
__NOTOC__
 
= Xen Scheduling =
 
This scheduling wiki page was originally compiled by [http://jacobmathai.blogspot.com Jacob Mathai].
 
   
  +
Xen also supports having several schedulers ''active'' at the same time, on disjoint groups of pCPUs (see [[Cpupools_Howto|cpupool]]).
Xen includes boot-time options for scheduling, similar to traditional Linux schedulers that divide CPU time among userland processes. Below you will find some Xen and DomU options for the schedulers.
 
   
  +
[[File:Sched2.jpg|none|400px]]
== 1. Borrowed Virtual Time (Xen 2.0/3.0) ==
 
   
  +
In this case, each pool has its own scheduler. In fact, even if two pools use the ''same'' scheduler, this means they're using two completely different and isolated '''instances''' of the same scheduling algorithm.
<pre><nowiki>
 
sched=bvt
 
Global Parameters
 
ctx_allow - The context switch allowance is similar to the ''quantum'' in traditional schedulers.
 
It is the minimum time that a scheduled domain will be allowed to run before being preempted.
 
   
  +
The user interacts with and affects the behaviour of the scheduler by:
Per-domain parameters
 
  +
* checking or changing a scheduler's global parameters,
mcuadv - the MCU (Minimum Charging Unit) advance determines the proportional share of the CPU
 
  +
* checking or changing a VM's scheduling parameters.
that a domain receives. It is set inversely proportionally to a domain's sharing weight.
 
warp - the amount of `virtual time' the domain is allowed to warp backwards
 
warpl - the warp limit is the maximum time a domain can run warped for
 
warpu - the unwarp requirement is the minimum time a domain must run unwarped for before it can warp again
 
</nowiki></pre>
 
   
  +
[[File:Sched3.jpg|none|400px]]
== 2. Atropos (Xen 2.0) ==
 
   
  +
= Currently Available Schedulers =
<pre><nowiki>
 
sched=atropos
 
Atropos is a soft real time scheduler. It provides guarantees about absolute shares of the CPU,
 
with a facility for sharing slack CPU time on a best-effort basis. It can provide timeliness
 
guarantees for latency-sensitive domains.
 
   
  +
== The Credit Scheduler ==
Every domain has an associated period and slice. The domain should receive `slice' nanoseconds
 
every `period' nanoseconds. This allows the administrator to configure both the absolute share
 
of the CPU a domain receives and the frequency with which it is scheduled.
 
   
  +
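[[Credit Scheduler|Credit]] is a general purpose, weighted fair share scheduler. It was the default through at least Xen 4.9; as of Xen 4.12 the default is [[Credit2 Scheduler Development|Credit2]] (see the support status table below).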
Note: don't over-commit the CPU when using Atropos (i.e. don't reserve more CPU than is
 
available -- the utilization should be kept to slightly less than 100% in order to ensure predictable
 
behavior).
 
   
  +
== The Credit2 Scheduler ==
Per-domain parameters :
 
period - The regular time interval during which a domain is guaranteed to receive its allocation of CPU time.
 
slice - The length of time per period that a domain is guaranteed to run for (in the absence of voluntary yielding of the CPU).
 
latency - The latency hint is used to control how soon after waking up a domain it should be scheduled.
 
xtratime - This is a boolean flag that specifies whether a domain should be allowed a share of the system slack time.
 
</nowiki></pre>
 
   
  +
[[Credit2 Scheduler Development|Credit2]] is the evolution of Credit. It is more scalable and handles latency-sensitive workloads better, while still being based on a general purpose, weighted fair share scheduling algorithm.
== 3. Round Robin (Xen 2.0) ==
 
   
  +
== The RTDS Scheduler ==
<pre><nowiki>
 
sched=rrobin
 
The round robin scheduler is included as a simple demonstration of Xen's internal scheduler
 
API. It is not intended for production use.
 
   
  +
[[RTDS-Based-Scheduler|RTDS]] is a real-time scheduler, aimed at supporting real-time workloads in the cloud, as well as embedded and mobile virtualization use cases.
Global Parameters
 
rr_slice - The maximum time each domain runs before the next scheduling decision is made.
 
</nowiki></pre>
 
   
== 4. sEDF scheduler (Xen 3.0) ==
+
== The ARINC653 Scheduler ==
   
  +
[[ARINC653 Scheduler|ARINC653]] is an embedded (automotive and avionics) real-time scheduler.
<pre><nowiki>
 
sched=sedf
 
(from docs/misc/sedf_scheduler_mini-HOWTO.txt)
 
This scheduler provides weighted CPU sharing in an intuitive way and uses realtime-algorithms
 
to ensure time guarantees.
 
   
  +
= Use cases and Support Status =
Per-domain parameters
 
use "xm sched-sedf <dom-id> <period> <slice> <latency-hint> <extra> <weight>"
 
-period/slice are the normal EDF scheduling parameters in nanosecs
 
-latency-hint is the scaled period in case the domain is doing heavy I/O
 
(unused by the currently compiled version)
 
-extra is a flag (0/1), which controls whether the domain can run in extra-time
 
-weight is mutually exclusive with period/slice and specifies another way of setting a domain's CPU slice
 
See wikipedia for a short intro to EDF:
 
http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
 
</nowiki></pre>
 
   
  +
{|class="prettytable" style="text-align: left;" valign="top"
== 5. ARINC 653 (Xen 4.0) ==
 
  +
!style="width: 15%;"|Scheduler
  +
!style="width: 30%;"|Use-cases
  +
!style="width: 15%;"|Xen < 4.7
  +
!style="width: 15%;"|Xen 4.8
  +
!style="width: 15%;"|Xen 4.9
  +
!style="width: 15%;"|Xen 4.12
  +
|-
  +
|[[Credit_Scheduler|Credit]]
  +
|General Purpose
  +
|{{Tick}} Supported<br>{{Tick}} '''Default'''
  +
|Supported<br>'''Default'''
  +
|Supported<br>'''Default'''
  +
|Supported
  +
|-
  +
|[[Credit2_Scheduler_Development|Credit2]]
  +
|General Purpose<br>
  +
Optimized for low latency, scalability, high VM density
  +
|{{Tick}} Experimental
  +
|{{Tick}} Supported
  +
|Supported
  +
|Supported<br>'''Default'''
  +
|-
  +
|[[RTDS-Based-Scheduler|RTDS]]
  +
|Soft & Firm Real-time<br>Embedded, mobile & automotive<br>Graphics & Gaming in the Cloud
  +
|{{Tick}} Experimental
  +
|{{Tick}} Improved xl support<br>Experimental
  +
|Experimental
  +
|Experimental
  +
|-
  +
|[[ARINC653_Scheduler|ARINC 653]]
  +
|Hard Real-time <br>Avionics, Drones, Medical
  +
|[https://lists.xenproject.org/archives/html/xen-devel/2015-06/msg00972.html Supported?]
  +
|?
  +
|?
  +
|?
  +
|}
   
  +
= Historical Xen Schedulers =
<pre><nowiki>
 
sched=arinc653
 
The arinc653 scheduler follows the ARINC 653 specification for scheduling, giving each partition (domain) a
 
fixed, dedicated time slot for execution.
 
   
  +
== simple Earliest Deadline First (sEDF) ==
Note: Current implementation does not support multicore, so 'maxcpus=1' must be set at boot.
 
</nowiki></pre>
 
   
  +
Quoting from the sEDF documentation (no longer in-tree), "this scheduler provides weighted CPU sharing in an intuitive way and uses real-time
= System Calls and Scheduling =
 
  +
algorithms to ensure time guarantees."
   
  +
The real-time algorithm used was [http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling Earliest Deadline First (EDF)], modified so that it could also be used as a general purpose scheduler. It could work in both work-conserving and non-work-conserving modes.
<pre><nowiki>
 
Some Scheduling System Calls
 
/schedule.c
 
SCHEDOP_yield
 
SCHEDOP_block
 
SCHEDOP_shutdown
 
*nice( )
 
getpriority( )
 
setpriority( )
 
sched_getscheduler( )
 
sched_setscheduler( )
 
sched_getparam( )
 
sched_setparam( )
 
sched_yield( )
 
sched_get_priority_min( )

sched_get_priority_max( )
 
sched_rr_get_interval( )
 
</nowiki></pre>
 
   
  +
It was introduced in Xen 3.0, and was the default for a while. The scheduler was never properly adapted for SMP systems and multi-vCPU VMs. Both worked, but behavior and performance were far from ideal and unreliable. It was eventually removed in Xen 4.6.
A related wiki topic on Real Time Applications & [[Preemption]] .
 
  +
  +
== Borrowed Virtual Time (BVT)==
  +
  +
A ''virtual time'' based, fair-share, general purpose scheduler, in use in Xen 2.0 and 3.0. Domains' shares of CPU time were determined by their weights. What is traditionally called the ''quantum'', or ''timeslice'', was known there as the '''context switch allowance''', and was configurable. It was SMP enabled, but lacked a non-work-conserving mode.
  +
  +
== Atropos ==
  +
  +
A soft real-time scheduler, capable of providing guarantees on absolute shares of CPU time, and allowing the ''slack'' to be used on a best-effort basis. Of course (as is always the case with RT schedulers), CPU slices were only really guaranteed in the absence of CPU over-commitment.
  +
  +
It was in use in Xen 2.0.
  +
  +
== Round Robin ==
  +
  +
It was... well... [https://en.wikipedia.org/wiki/Round-robin_scheduling Round Robin]! It was there as a simple demonstration of Xen's internal scheduler API, not for real production use.
   
 
== Also See ==
 
== Also See ==
  +
* [[Credit Scheduler]]
 
  +
* '''sched=''' boot parameter in [http://xenbits.xen.org/docs/unstable/misc/xen-command-line.html Xen unstable boot options]
* [[Credit2 Scheduler Development]]
 
  +
* [[XL|xl]] [http://xenbits.xen.org/docs/unstable/man/xl.1.html#SCHEDULER-SUBCOMMANDS scheduler subcommands]
  +
* [[:Category:Scheduler]]
  +
* [[:Category:Resource Management]]
  +
* [[:Category:Performance]]
   
 
[[Category:Xen]]
 
[[Category:Xen]]
  +
[[Category:Xen 4.5]]
 
[[Category:Overview]]
 
[[Category:Overview]]
 
[[Category:Developers]]
 
[[Category:Developers]]
  +
[[Category:Users]]
  +
[[Category:Scheduler]]
  +
[[Category:Resource Management]]

Latest revision as of 04:30, 7 December 2019

Overview

The Xen Project Hypervisor supports several different virtual CPU schedulers, with different properties.

The job of a hypervisor's scheduler is to decide, among all the vCPUs of the various virtual machines, which ones should run on the host's physical CPUs (pCPUs) at any given time.

Xen also supports having several schedulers active at the same time, on disjoint groups of pCPUs (see cpupool).

Sched2.jpg

In this case, each pool has its own scheduler. In fact, even if two pools use the same scheduler, this means they're using two completely different and isolated instances of the same scheduling algorithm.
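The isolation between pools can be pictured with a small model. This is only an illustrative sketch, not Xen code: the class names, the check_disjoint helper and the tslice_ms parameter value are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SchedulerInstance:
    """Private state of one scheduler running in one pool (hypothetical model)."""
    algorithm: str
    params: dict = field(default_factory=dict)


@dataclass
class CpuPool:
    """A named group of pCPUs with its own scheduler instance (hypothetical model)."""
    name: str
    pcpus: frozenset
    scheduler: SchedulerInstance


def check_disjoint(pools):
    """Raise ValueError if any pCPU is assigned to more than one pool."""
    seen = set()
    for pool in pools:
        overlap = seen & pool.pcpus
        if overlap:
            raise ValueError(f"pCPUs {sorted(overlap)} are in more than one pool")
        seen |= pool.pcpus


# Two pools running the same algorithm, each with its own instance.
pool_a = CpuPool("pool-a", frozenset({0, 1}), SchedulerInstance("credit", {"tslice_ms": 30}))
pool_b = CpuPool("pool-b", frozenset({2, 3}), SchedulerInstance("credit", {"tslice_ms": 30}))
check_disjoint([pool_a, pool_b])

# Changing a parameter in one instance leaves the other instance untouched.
pool_a.scheduler.params["tslice_ms"] = 5
```

Because every pool owns a separate SchedulerInstance, tuning one pool cannot affect the VMs scheduled in another, which is the property described above.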

The user interacts with and affects the behaviour of the scheduler by:

  • checking or changing a scheduler's global parameters,
  • checking or changing a VM's scheduling parameters.
Sched3.jpg

Currently Available Schedulers

The Credit Scheduler

Credit is a general purpose, weighted fair share scheduler. It was the default through at least Xen 4.9; as of Xen 4.12 the default is Credit2 (see the support status table below).
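As a rough illustration of what "weighted fair share" means, the sketch below computes the fraction of CPU time each domain would receive under full contention. It is not the Credit algorithm itself; the function name, domain names and weight values are assumptions chosen for the example.

```python
def cpu_shares(weights):
    """Map each domain to its fraction of CPU time when all domains are busy."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one domain needs a positive weight")
    # A domain's share is its weight relative to the sum of all weights.
    return {dom: weight / total for dom, weight in weights.items()}


shares = cpu_shares({"dom-a": 256, "dom-b": 512, "dom-c": 256})
```

A domain with twice the weight of another is entitled to twice as much CPU time when both compete for the same pCPUs.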

The Credit2 Scheduler

Credit2 is the evolution of Credit. It is more scalable and handles latency-sensitive workloads better, while still being based on a general purpose, weighted fair share scheduling algorithm.

The RTDS Scheduler

RTDS is a real-time scheduler, aimed at supporting real-time workloads in the cloud, as well as embedded and mobile virtualization use cases.

The ARINC653 Scheduler

ARINC653 is an embedded (automotive and avionics) real-time scheduler.

Use cases and Support Status

Scheduler | Use-cases | Xen < 4.7 | Xen 4.8 | Xen 4.9 | Xen 4.12
Credit | General Purpose | Supported, Default | Supported, Default | Supported, Default | Supported
Credit2 | General Purpose; optimized for low latency, scalability, high VM density | Experimental | Supported | Supported | Supported, Default
RTDS | Soft & Firm Real-time; Embedded, mobile & automotive; Graphics & Gaming in the Cloud | Experimental | Improved xl support, Experimental | Experimental | Experimental
ARINC 653 | Hard Real-time; Avionics, Drones, Medical | Supported? | ? | ? | ?

Historical Xen Schedulers

simple Earliest Deadline First (sEDF)

Quoting from the sEDF documentation (no longer in-tree), "this scheduler provides weighted CPU sharing in an intuitive way and uses real-time algorithms to ensure time guarantees."

The real-time algorithm used was Earliest Deadline First (EDF), modified so that it could also be used as a general purpose scheduler. It could work in both work-conserving and non-work-conserving modes.
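The core EDF decision, and the difference between the two modes, can be sketched as follows. This is a simplified model and not the sEDF implementation: the Vcpu fields and the single work_conserving switch are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Vcpu:
    name: str
    deadline: int   # absolute deadline of the current period, in ns
    remaining: int  # slice still unused in the current period, in ns


def pick_next(vcpus, work_conserving=False):
    """Return the vCPU with the earliest deadline that may run, or None to idle."""
    runnable = [v for v in vcpus if v.remaining > 0]
    if not runnable and work_conserving:
        # Work-conserving mode: hand out slack time instead of idling the pCPU.
        runnable = list(vcpus)
    return min(runnable, key=lambda v: v.deadline, default=None)


vcpus = [
    Vcpu("a", deadline=30, remaining=0),
    Vcpu("b", deadline=50, remaining=5),
    Vcpu("c", deadline=40, remaining=2),
]
chosen = pick_next(vcpus)
```

In non-work-conserving mode a vCPU that has used up its slice must wait for its next period even if the pCPU would otherwise sit idle; in work-conserving mode the spare time is given away.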

It was introduced in Xen 3.0, and was the default for a while. The scheduler was never properly adapted for SMP systems and multi-vCPU VMs. Both worked, but behavior and performance were far from ideal and unreliable. It was eventually removed in Xen 4.6.

Borrowed Virtual Time (BVT)

A virtual time based, fair-share, general purpose scheduler, in use in Xen 2.0 and 3.0. Domains' shares of CPU time were determined by their weights. What is traditionally called the quantum, or timeslice, was known there as the context switch allowance, and was configurable. It was SMP enabled, but lacked a non-work-conserving mode.
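The idea of virtual time can be sketched in a few lines. This is a minimal model under stated assumptions, not the BVT implementation: it ignores warping, the Domain class and function name are hypothetical, and ctx_allow stands in for the context switch allowance.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    name: str
    weight: int
    virtual_time: float = 0.0


def run_one_allowance(domains, ctx_allow):
    """Run the domain with the lowest virtual time for one context switch allowance."""
    current = min(domains, key=lambda d: d.virtual_time)
    # Virtual time advances inversely to weight, so heavier domains
    # fall behind more slowly and get picked more often.
    current.virtual_time += ctx_allow / current.weight
    return current.name


domains = [Domain("dom-a", weight=2), Domain("dom-b", weight=1)]
order = [run_one_allowance(domains, ctx_allow=1.0) for _ in range(6)]
```

Over the six allowances in this example, dom-a (weight 2) runs twice as often as dom-b (weight 1).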

Atropos

A soft real-time scheduler, capable of providing guarantees on absolute shares of CPU time, and allowing the slack to be used on a best-effort basis. Of course (as is always the case with RT schedulers), CPU slices were only really guaranteed in the absence of CPU over-commitment.
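The over-commitment condition can be checked with a simple utilization sum. The sketch below assumes each domain reserves a slice of CPU time per period on a single pCPU; the function name and the numbers are illustrative and not taken from Atropos itself.

```python
from fractions import Fraction


def is_overcommitted(reservations):
    """reservations: iterable of (slice_ns, period_ns) pairs sharing one pCPU."""
    # Each domain asks for slice/period of the CPU; more than 100% cannot be honoured.
    utilization = sum(Fraction(slice_ns, period_ns) for slice_ns, period_ns in reservations)
    return utilization > 1


ok = is_overcommitted([(2_000_000, 10_000_000), (5_000_000, 10_000_000)])
too_much = is_overcommitted([(6_000_000, 10_000_000), (5_000_000, 10_000_000)])
```

Here the first set reserves 70% of the pCPU and can be guaranteed, while the second asks for 110% and cannot.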

It was in use in Xen 2.0.

Round Robin

It was... well... Round Robin! It was there as a simple demonstration of Xen's internal scheduler API, not for real production use.
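For completeness, round robin fits in a few lines. This is a generic sketch of the technique rather than the removed Xen scheduler; rr_slice mirrors the name of its old global parameter, and the function name and time units are assumptions.

```python
from collections import deque


def round_robin(domains, rr_slice, total_time):
    """Return the (domain, start, end) intervals produced by round-robin scheduling."""
    if rr_slice <= 0:
        raise ValueError("rr_slice must be positive")
    queue = deque(domains)
    timeline = []
    now = 0
    while queue and now < total_time:
        dom = queue.popleft()
        end = min(now + rr_slice, total_time)
        timeline.append((dom, now, end))
        now = end
        queue.append(dom)  # back of the queue until its turn comes round again
    return timeline


timeline = round_robin(["dom-a", "dom-b"], rr_slice=10, total_time=30)
```

Every domain gets the same slice in strict rotation, with no notion of weight, priority or deadline.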

Also See