Debugging Xen


This page describes what to look at when debugging what appears to be errant Xen behavior. Including as much of the information below as possible with bugs reported to the Xen project makes it much easier for developers to figure out what has gone wrong and fix the issue.

If you think you have found a bug, please let the developers know by sending an e-mail to xen-devel@lists.xen.org. For details on how to do this, see the wiki page Reporting Bugs against Xen. It may also be a good idea to ask on IRC or on the user mailing list whether others have had a similar experience.

There is a tool called xen-bugtool that collects the logs and other relevant information from your system, but it is not in good shape at the moment. We are working on reviving it (or on putting a replacement in place).

Sources of Information

Please include the following information in all bug reports (unless completely inappropriate):

  • Before capturing any logs, make sure the Xen hypervisor (xen.gz) boot options in your bootloader (GRUB) settings include "loglvl=all guest_loglvl=all". Reboot after changing the bootloader settings.
  • Dom0 operating system (including distribution and version)
  • Hardware platform (UP, SMP, number and type of disks/NICs)
  • Kernel config for dom0 and domU (if modified from the default)
  • Output from dmesg under dom0
  • Output from dmesg under domU
  • Output from the serial port if Xen oopses
  • Complete oops message or kernel panic
  • System.map for the affected kernel, if Linux oopses or panics
  • GRUB entry for the domain
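
To confirm after the reboot that both log-level options actually reached the hypervisor, one option is to inspect the xen_commandline line printed by xl info. The following is a minimal sketch rather than an official check: the helper name has_full_loglvl is made up for illustration, and the usage lines assume that your xl info output contains a xen_commandline field.

```shell
# has_full_loglvl CMDLINE
# Succeeds only if CMDLINE contains both "loglvl=all" and "guest_loglvl=all"
# as separate, space-delimited options.
# (Hypothetical helper name, for illustration only.)
has_full_loglvl() {
    # Padding with spaces lets us match whole options; " loglvl=all " does
    # not match inside "guest_loglvl=all" because of the leading space.
    case " $1 " in
        *" loglvl=all "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" guest_loglvl=all "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage in dom0, as root (assumes xl info prints a xen_commandline line):
#   cmdline=$(xl info | sed -n 's/^xen_commandline *: *//p')
#   has_full_loglvl "$cmdline" || echo "log-level options missing" >&2
```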

The following will be collected by xen-bugtool, if you use it:

  • Architecture (x86, x86 with PAE, x86_64, VT/SVM)
  • Output from xl dmesg under dom0
  • Output of xl info under dom0 (this includes the Xen revision number)
  • Lots of log files
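
If xen-bugtool is not usable on your system, the same command output can be gathered by hand. The sketch below is one possible way to do that, not a replacement for the tool: the function name capture and the /tmp/xen-report directory are arbitrary choices made for this example.

```shell
# capture OUTDIR NAME CMD [ARGS...]
# Runs CMD and stores its stdout and stderr in OUTDIR/NAME.txt.
# A non-zero exit status is appended to the file; a command that is not
# installed is noted in the file, so a collection run can carry on.
# (Hypothetical helper; variables are global because POSIX sh has no "local".)
capture() {
    outdir=$1; name=$2; shift 2
    mkdir -p "$outdir" || return 1
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$outdir/$name.txt" 2>&1 ||
            echo "exit status: $?" >>"$outdir/$name.txt"
    else
        echo "command not found: $1" >"$outdir/$name.txt"
    fi
}

# Usage in dom0, as root:
#   capture /tmp/xen-report xl-dmesg xl dmesg
#   capture /tmp/xen-report xl-info  xl info
#   capture /tmp/xen-report dmesg    dmesg
```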

If this is a networking bug, please try to include the following, captured before and after the failure where applicable:

  • ifconfig -a output
  • brctl output (if using bridging)
  • netstat -s
  • netstat -rn
  • your domain configuration file
  • your interface configuration mechanism (DHCP, private, ifcfg-* files, etc.)
  • details on what works and what doesn't (can you ping internal or external hosts? is only TCP broken? etc.)
  • a tcpdump or Wireshark (formerly Ethereal) trace capture of the connection, or a brief snapshot if it is too large
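
Because the networking state is most useful when captured both before and after the failure, it can help to script the snapshot so that the two runs are directly comparable. The following is a sketch under assumptions: the function name net_snapshot, the output directory and the labels are illustrative, and the interface name eth0 in the tcpdump line is only a placeholder for your own interface.

```shell
# net_snapshot OUTDIR LABEL
# Saves the output of each networking command listed above to
# OUTDIR/LABEL-<command>.txt, with spaces in the command replaced by "_".
# A missing tool or a failing command is noted in its file instead of
# aborting the run. (Hypothetical helper name.)
net_snapshot() {
    outdir=$1; label=$2
    mkdir -p "$outdir" || return 1
    for cmd in "ifconfig -a" "brctl show" "netstat -s" "netstat -rn"; do
        file="$outdir/$label-$(printf '%s' "$cmd" | tr ' ' '_').txt"
        # ${cmd%% *} is the command name without its arguments.
        if command -v "${cmd%% *}" >/dev/null 2>&1; then
            # $cmd is deliberately unquoted so it splits into name and arguments.
            $cmd >"$file" 2>&1 || echo "exit status: $?" >>"$file"
        else
            echo "command not found: ${cmd%% *}" >"$file"
        fi
    done
}

# Usage, as root:
#   net_snapshot /tmp/xen-report before
#   ... reproduce the failure ...
#   net_snapshot /tmp/xen-report after
#   diff -r is not needed; compare matching before-*/after-* files by hand.
# A short packet capture (eth0 is a placeholder interface name):
#   tcpdump -i eth0 -c 1000 -w /tmp/xen-report/trace.pcap
```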

Debugging Xenstore Problems

If you suspect that the wrong values are being written to or read from Xenstore, then you may turn on xenstore tracing to help debug the problem:

  1. Set XENSTORED_TRACE=1 in /etc/{sysconfig,default}/xencommons.
  2. Reboot.
  3. xenstored will then write a trace file, giving every read and write on the store. Use xen-bugtool to attach that and all your other logs to a Bugzilla entry.
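
Step 1 can be done in any editor. If you prefer to script it, the sketch below shows one way to set the variable without ending up with duplicate assignments. The function name set_conf_var is made up for this example, and it assumes the variable name contains no characters that are special in a regular expression.

```shell
# set_conf_var FILE NAME VALUE
# Removes any active (uncommented) assignment of NAME from FILE and appends
# a single "NAME=VALUE" line. Comment lines are left untouched.
# (Hypothetical helper name.)
set_conf_var() {
    file=$1; name=$2; value=$3
    [ -f "$file" ] || return 1
    tmp=$(mktemp) || return 1
    # Copy every line except existing assignments of NAME.
    sed "/^[[:space:]]*$name=/d" "$file" >"$tmp" || { rm -f "$tmp"; return 1; }
    echo "$name=$value" >>"$tmp"
    # Write back through cat so FILE keeps its owner and permissions.
    cat "$tmp" >"$file" && rm -f "$tmp"
}

# Usage, as root (only the path that exists on your system is changed):
#   for f in /etc/sysconfig/xencommons /etc/default/xencommons; do
#       [ -f "$f" ] && set_conf_var "$f" XENSTORED_TRACE 1
#   done
```

Running it twice leaves a single XENSTORED_TRACE=1 line, so it is safe to repeat.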

Some easy steps to enable xend/xenstored tracing

# service xend stop
# killall -9 xenstored
# export XEND_DEBUG=1
# export XENSTORED_TRACE=1
# /usr/sbin/xend trace_start >/dev/null 2>&1 &


  • xend pid file: /var/run/xend.pid
  • xend trace file: /var/log/xen/xend.trace
  • xend log file (set by (logfile ...) in /etc/xen/xend-config.sxp; default: /var/log/xen/xend.log)
  • xend debug log: /var/log/xen/xend-debug.log
  • xenstored pid file: /var/run/xenstore.pid
  • xenstored trace file: /var/log/xen/xenstored-trace.log
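
Not every file in this list exists on every system (for example, the trace files only appear once tracing is enabled). Before attaching logs to a report, a small helper such as the following can show which ones are actually present. This is an illustrative sketch; the name list_present is hypothetical.

```shell
# list_present FILE...
# Prints each argument that exists as a regular file, one per line, and
# reports the others on stderr. (Hypothetical helper name.)
list_present() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            echo "$f"
        else
            echo "missing: $f" >&2
        fi
    done
}

# Usage with the default paths listed above:
#   list_present /var/log/xen/xend.trace /var/log/xen/xend.log \
#       /var/log/xen/xend-debug.log /var/log/xen/xenstored-trace.log
```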

Note: on some systems, xend tracing does not work. As a workaround, you can enable xenstored tracing alone by applying this patch:

--- /usr/sbin/xend.orig 2008-11-24 16:09:16.000000000 +0800
+++ /usr/sbin/xend      2008-11-20 11:59:20.000000000 +0800
@@ -78,7 +78,8 @@ def check_user():
 def start_xenstored():
     XENSTORED_TRACE = os.getenv("XENSTORED_TRACE")
     cmd = "xenstored --pid-file /var/run/xenstore.pid"
-    if XENSTORED_TRACE:
+    #if XENSTORED_TRACE:
+    if True:
         cmd += " -T /var/log/xen/xenstored-trace.log"
     s,o = commands.getstatusoutput(cmd)
 
Then restart xend:
# service xend stop
# killall -9 xenstored
# service xend start


Note: on some systems, killing xenstored causes dom0 to misbehave (no hotplug events are generated). Reboot dom0 to resolve this problem.
