XenRT Architecture Guide


Overview

This guide is intended for people who work on XenRT. It should be read together with the [XenRT_User_Guide].

Structure

XenRT consists of a number of layers:

  • Job control / Scheduling
  • Test sequences
  • Harness
  • Libraries
  • Testcases
  • Infrastructure

Infrastructure

XenRT provides a number of network infrastructure services. A XenRT deployment is based on a controller which owns the network on which it sits. The controller provides the following facilities to the network:

  • DHCP and PXE booting for test hosts and VMs.
  • WWW and NFS exported space for miscellaneous use.

Testcases

Testcases are Python classes that extend xenrt.TestCase. Testcases have two main methods that are called by the test harness: the prepare method, which is intended for set-up activities such as installing VMs, and the run method, where the actual test execution should happen and the pass/fail decision should be made. Typically testcases make extensive use of library objects.
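
A minimal sketch of such a testcase is shown below. The class name is made up, and the host methods used (getDefaultHost, execdom0) are assumptions for illustration; only the overall prepare/run shape is taken from the text above.

    import xenrt

    class TCExample(xenrt.TestCase):
        """Skeleton testcase: the harness calls prepare and then run."""

        def prepare(self, arglist=None):
            # Set-up work (installing VMs, configuring the host, etc.) goes here.
            self.host = self.getDefaultHost()

        def run(self, arglist=None):
            # The actual test: exercise the product and make the pass/fail decision.
            output = self.host.execdom0("echo hello").strip()
            if output != "hello":
                raise xenrt.XRTFailure("Unexpected output from dom0 command")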

Libraries

To make testcases easier to write, XenRT provides library code for common operations. In particular, the library provides object-oriented models of hosts, VMs, storage repositories and so on. These allow access to common functionality without having to write lots of code. For example, a generic Linux VM can be created simply by calling the createGenericLinuxGuest method on a host object. These objects also allow a testcase to keep track of the state it expects the product to be in, so that it can easily verify that state by calling the appropriate check methods.
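
For example, inside a testcase the library objects might be used roughly as follows. createGenericLinuxGuest and the idea of check methods come from the description above; the shutdown and start method names are assumptions for illustration.

    def exercise_guest(host):
        """Illustrative use of the host and guest library objects."""
        guest = host.createGenericLinuxGuest()   # install a generic Linux VM
        guest.shutdown()                         # lifecycle operations on the guest object
        guest.start()
        guest.check()                            # verify the VM matches the expected state
        return guest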

Harness

The harness is the executable application (xrt) used to run testcases within a XenRT infrastructure. The harness provides the basic mechanisms to launch testcases, track testcase dependencies and record and report results.

Test sequences

Test sequences are collections of multiple testcases with optional dependencies and parallelism.

Test sequences are run on one or more test servers, and should generally include related tests.

Test sequences also have a prepare section where a particular pool/host/VM configuration can be established.
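
A sequence file might look roughly like the sketch below. This is only an illustration of the general shape (a prepare section followed by serial and parallel groups of testcases); the element names and testcase identifiers are assumptions, so consult the files in the seqs directory for the exact schema.

    <xenrt>
      <prepare>
        <!-- Establish the pool/host/VM configuration before any testcase runs -->
        <host id="0"/>
      </prepare>
      <testsequence>
        <serial>
          <testcase id="testcases.example.TCInstall"/>
          <parallel>
            <testcase id="testcases.example.TCSuspendResume"/>
            <testcase id="testcases.example.TCReboot"/>
          </parallel>
        </serial>
      </testsequence>
    </xenrt>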

Job control / Scheduling

The job control layer is a management function for multiple test jobs (where a job is the execution of a test sequence with a particular configuration on a particular server or servers) over multiple deployments. Job control executes the xrt command to run the test sequence, and manages the available servers and the allocation of jobs to servers.

Job control is done using a Python CGI script (queue.cgi) which runs on a nominated 'master' XenRT controller. This script interacts with a PostgreSQL database for storing job information and results. A web interface is provided, which allows a user to bring up information on a job, and view lists of certain types of jobs (such as BVTs). The command line tool xenrt interacts with queue.cgi via HTTP GETs and POSTs, in order to submit jobs, and retrieve information etc.

Jobs can be in a number of states: new, running, done and removed.

Scheduling is performed by a cron job that runs every two minutes on the 'master' controller. This calls xenrt schedule, which reviews the list of new jobs and determines whether appropriate machines are available. The machine requirements are constrained using the following parameters:

Parameter            Description
MACHINES_REQUIRED    The number of machines needed for the test (default 1)
RESOURCES_REQUIRED   A slash-separated list of resources, e.g. "memory>=4G/cores>=2/disk1>=50G"
SITE                 A comma-separated list of specific XenRT clusters to use (e.g. SVCL01)
POOL                 A comma-separated list of specific XenRT pools of machines to use (e.g. VMX,SVM)
MACHINE              A comma-separated list of specific machines to use

Notes

  • The order in which jobs are scheduled is based on jobid, and the optional JOBPRIO field (lower number = higher priority, defaults to 3)
  • The scheduler will ensure that all machines selected are in the same pool and site
  • Note that it is possible to submit a job that will never get scheduled
  • When a number of machines are available to suit a job's requirements, the scheduler will randomly choose machines from the list - this is to avoid over-working specific machines
  • Once suitable machines have been found, the scheduler sets the SCHEDULED_ON and MACHINE fields for the job to indicate which machines have been used, changes the job status to running, and updates the machine records in the database to indicate that they have been scheduled. The machine status for the 'main' machine is set to scheduled, any other machines are set to slaved.

On each XenRT controller, the site-controller script is run periodically by cron. This script fetches the list of machines the controller manages, and looks for any that have a status of scheduled. For those found, it then starts a XenRT harness process with the appropriate parameters, and updates the machine status to running.
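
The machine-selection logic described in the notes above amounts to something like the following. This is a simplified illustration, not the actual scheduler code; the dictionary keys are assumptions.

    import random

    def pick_machines(job, machines):
        """Simplified machine selection: all chosen machines must share a site
        and pool, and the pick among suitable machines is random to spread
        the load across the estate."""
        wanted = job.get("MACHINES_REQUIRED", 1)
        for site in sorted({m["site"] for m in machines}):
            for pool in sorted({m["pool"] for m in machines if m["site"] == site}):
                candidates = [m for m in machines
                              if m["site"] == site and m["pool"] == pool
                              and m["status"] == "idle"]
                if len(candidates) >= wanted:
                    return random.sample(candidates, wanted)
        return None   # no suitable machines; the job stays in the 'new' state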

Operation

The xrt harness runs either a single testcase (for debugging purposes usually) or a test sequence. xrt is launched either by hand from the command line or by the job control layer in a managed environment.

The harness maintains a registry of objects (hosts, VMs etc.). Older-style testcases use objects created by earlier testcases in the same sequence. For example, a VM install testcase may create a VM object which is inserted into the registry for later testcases, such as suspend/resume tests, to use. Note that the current model is that, wherever possible, a testcase should create the environment it requires in its prepare method, or, if this is not practical, the sequence prepare section should be used, rather than relying on 'utility' testcases in the way described here.

Hosts and Guests (VMs)

Hosts in a test sequence are referred to by the names RESOURCE_HOST_<n>, where <n> counts from 0 upwards and corresponds to the list of physical servers supplied to the harness for use with the sequence.

Guests can be added to the harness registry for later testcases to use. This happens by default for guests installed by the sequence prepare section, and is optional for guests installed by testcases. The usual model is to have self-contained testcases that install guests solely for their own use and then uninstall them at the end of the test, i.e. they do not add them to the registry. Certain older, legacy testcases do not install VMs themselves, and so require either a utility testcase or the sequence prepare section to install the VM and add it to the registry. Testcases that uninstall guests should remove them from the registry. Registry lookup is by name or by XML configuration string (see later). If names are used, care must be taken to avoid duplicate names within a sequence.
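
In code, adding a guest to the registry and retrieving it later looks roughly like this. The registry accessor names (guestPut, guestGet) are assumptions based on the xenrt.registry module described under Internals below, and the guest name and lifecycle methods are illustrative.

    import xenrt

    def install_and_register(host):
        """Install a guest and make it available to later testcases by name."""
        guest = host.createGenericLinuxGuest()
        xenrt.TEC().registry.guestPut("mylinuxguest", guest)

    def use_registered_guest():
        """A later testcase retrieves the guest by name and exercises it."""
        guest = xenrt.TEC().registry.guestGet("mylinuxguest")
        guest.suspend()
        guest.resume()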

Internals

File Structure

Within the XenRT codebase there are a number of key directories:

  • control: Files relating to the web interface (queue.cgi), and the command line xenrt tool. These are the job control / scheduling layer.
  • exec: Python code for the harness, libraries and testcases.
  • scripts: Script files that are copied to every host/VM used by XenRT.
  • seqs: XML sequence files specifying sets of testcases.
  • tests: Files required by testcases (note, 3rd party binaries etc are stored separately).

Code Structure

There are two 'top-level' modules within the XenRT system: the xenrt module, which contains the harness itself and the libraries, and the testcases module, which contains the testcases. These are visible as the directories xenrt and testcases within the exec directory.

Supported platforms

At present, there are four main types of supported platform that XenRT can execute tests against - open source Xen hosts (oss), Citrix XenServer hosts (xenserver), to a limited extent Microsoft Hyper-V hosts (satori), and native hosts (native). To support this, library objects are split up appropriately (for example, exec/xenrt/lib/xenserver as opposed to exec/xenrt/lib/oss). There are also 'generic' classes, which platform specific classes should extend - for example GenericHost and GenericGuest. These contain methods that are shared across all platforms (for example methods to interact with the XML-RPC daemon used in Windows VMs).
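
The split can be pictured schematically as follows. The module paths follow the layout described above, but the class and method bodies are purely illustrative and do not reflect the real inheritance hierarchy in detail.

    class GenericHost(object):
        """exec/xenrt/lib/generic: behaviour shared across all platforms,
        e.g. talking to the XML-RPC daemon inside Windows VMs."""
        def checkReachable(self):
            pass

    class XenServerHost(GenericHost):
        """exec/xenrt/lib/xenserver: XenServer-specific behaviour layered on top."""
        def poolJoin(self, master):
            pass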

Other modules/code

main.py main.py (aka xrt) is the script which is actually executed either directly by the user, or by the job management layer. This will instantiate the necessary XenRT environment (see later), and start any sequence/testcase specified. It can also be used for debugging purposes to get an interactive XenRT shell using the -shell or -shell-logs options.

xenrt.config This module contains methods for accessing XenRT configuration. Configuration data can be obtained from a number of locations:

  • Hard-coded into config.py (e.g. supported OSes for each version of XenServer)
  • Read from the controller's site.xml file (e.g. details of storage devices)
  • Read from the machine's configuration file (e.g. MAC address details)
  • Read from the harness command line or sequence file (job-specific configuration)

xenrt.util This module contains useful methods to save time, such as isUUID.

xenrt.powerctl This module contains classes for remotely controlling the power of machines. It supports a number of different methods, the most common being APC PDUs.

xenrt.registry This module contains methods for storing and retrieving data from the registry. The registry is used for storing objects such as hosts and guests between different tests.

xenrt.resources This module contains a variety of classes for resources, such as storage devices (e.g. iSCSI LUNs, NetApp/EqualLogic filers etc), shared directories (NFS/HTTP/temporary/working), and network resources (test peers etc). The majority of these resources use a locking system, implemented in the CentralResource base class, to ensure they are only used by one instance of XenRT at a time on a controller, so that, for example, multiple separate tests don't try to use the same iSCSI LUN.

xenrt.results This module contains classes used for managing results of tests.

xenrt.sequence This module contains the methods needed to use sequence files. It performs the XML parsing of these files, and also carries out necessary prepare steps etc.

xenrt.suite This module contains the functionality needed to manage a test suite. A test suite is defined in TestRun, and is essentially a collection of sequences.

xenrt.tools This module contains miscellaneous tools, the majority of which are used by special command line arguments to main.py.

Execution Environment

When you execute the XenRT test harness, several parts of the execution environment are set up.

The Global Execution Context (GEC) This is an object used to store the current global execution state of the harness. It stores information such as configuration, lists of running tests, callbacks etc. Upon initialisation it also creates the anonymous Test Execution Context.

XML-RPC server This allows the running test to be interacted with from a remote location - see the user guide for more info.

Test Execution Contexts (TECs) Each running test case runs inside its own Test Execution Context. This is used to keep separate logfiles and working directories for each test case, to avoid any collisions etc. At initialisation time an Anonymous TEC is created for any methods that run outside of a test case. The current TEC can be accessed at any time by calling xenrt.TEC() - this will either return the test specific TEC, or if one is not available, the anonymous TEC.

A TEC provides various methods for harness level operations, such as logging (e.g. xenrt.TEC().logverbose), configuration lookup (e.g. xenrt.TEC().lookup), and temporary file/directory creation (e.g. xenrt.TEC().tempFile). See the API docs for more information.
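
For example, the three calls named above might be used together like this. The configuration key MY_TIMEOUT is made up, and the default-value argument to lookup is an assumption.

    import xenrt

    tec = xenrt.TEC()
    tec.logverbose("Starting the interesting part of the test")
    timeout = tec.lookup("MY_TIMEOUT", "600")   # config lookup, with a default value
    workfile = tec.tempFile()                   # a temporary file in this test's working area
    tec.logverbose("Using timeout %s and temp file %s" % (timeout, workfile))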

Exceptions There are four exceptions that can be raised during testcase execution.

Exception    Description
XRTError     Used to indicate a harness error, i.e. a problem that is not a failure of the testcase
XRTSkip      Used within a subcase to indicate that the subcase has been skipped (note this should only be used in subcases)
XRTFailure   Used to indicate a failure of the testcase (i.e. a problem with the software under test)
XRTBlocker   Used within the harness if a testcase that has been marked in the sequence as a blocking testcase fails (note this should not be raised directly by a testcase)

These exceptions all extend XRTException, and take two parameters (neither is required, but by convention a reason should always be given):

Parameter    Description
reason       Why the exception has been raised, i.e. a rough description of the problem
data         Any specific data relevant to the exception (e.g. values of certain fields)

The reason parameter should in general not include any runtime-specific data, since it is used for tracking duplicates in Jira. Some information is automatically masked when failures are automatically filed, such as UUIDs, host names (unless prefixed by an exclamation mark), IP addresses and PIDs.
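
For example, runtime specifics belong in data rather than in reason. The helper function and the keyword form of the data argument below are assumptions for illustration.

    import xenrt

    def check_value(observed, expected):
        """Raise XRTFailure if the product returned the wrong value."""
        if observed != expected:
            # Keep the reason stable across runs (it is used for duplicate
            # tracking in Jira); put the run-specific values in data instead.
            raise xenrt.XRTFailure("Product returned an unexpected value",
                                   data="expected=%r observed=%r" % (expected, observed))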

Any other exceptions raised by a testcase (e.g. due to testcase bugs such as undefined variables, lookup failures etc) will be interpreted by the harness as XRTErrors.

Testcase Structure

Testcases are executed based on the following logic:

  1. Optionally wait on a per-test semaphore
  2. Run the testcase's prepare method
  3. Run the testcase's run method
  4. Optionally wait for user interaction, depending on the PAUSE_ON_ settings
  5. Run the testcase's preLogs method
  6. Collect logs from relevant hosts and guests
  7. Run the testcase's postRun method

Exceptions generated in the prepare method will result in the test erroring. XRTFailure exceptions generated in the run method will result in the test failing; any other type of exception will result in a test error.
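
The pass/fail/error mapping can be summarised as in the sketch below. This is a simplified illustration of the logic described above, not the actual harness code; in particular the real prepare and run methods take an arglist, and the semaphore, pause and log-collection steps are omitted.

    import xenrt

    def execute_testcase(tc):
        """Simplified sketch of how the harness maps exceptions to results."""
        try:
            tc.prepare()
        except Exception:
            return "error"              # any exception in prepare -> test error
        try:
            tc.run()
            result = "pass"
        except xenrt.XRTFailure:
            result = "fail"             # a problem with the software under test
        except Exception:
            result = "error"            # testcase bug or harness problem
        tc.preLogs()                    # then logs are collected from hosts/guests
        tc.postRun()                    # and postRun performs any final clean-up
        return result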