Date:      Sun, 26 Aug 2007 00:07:08 +1200
From:      Andrew Turner <andrew@fubar.geek.nz>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: FreeBSD on xen hvm
Message-ID:  <20070826000708.15fbb5bb@hermies.int.fubar.geek.nz>
In-Reply-To: <20070824132409.W3900@fledge.watson.org>
References:  <20070824181627.57bed401@hermies.int.fubar.geek.nz> <20070824132409.W3900@fledge.watson.org>

On Fri, 24 Aug 2007 13:30:17 +0100 (BST)
Robert Watson <rwatson@FreeBSD.org> wrote:
> On Fri, 24 Aug 2007, Andrew Turner wrote:
> 
> > 1) PREEMPTION Preemption causes the kernel to panic with a page
> > fault. The dmesg is available from [1].
> 
> Any chance it's possible to get a core for this, or attach GDB
> somehow to the VM?
I haven't managed to get remote GDB working, and it's too early in the
boot for a core. I can get a Xen core dump, but it would need
processing to turn it into something gdb could use.
>  It looks like timing in Xen may be exposing a
> race in some or another subsystem with timers, but figuring out which
> subsystem it is will be most easily done if we can inspect the
> callout information, which is most easily done with GDB since you can
> inspect the callout structure more easily.  If not, then we can add
> some printfs to extract the information, I expect, or extend DDB.  We
> need to find out what the function pointer in the callout structure
> is for.
I've created a patch at [1] to add "show callouts" to ddb. It prints
every callout in callwheel along with the name of the function each
one calls. With PREEMPTION enabled, the callouts are:
loadav
in6_tmpaddrtimer
in6_rtqtimo
in_rtqtimo
in6_mtutimo
uma_timeout
nd6_slowtimo
nfsrv_timer
tcp_isn_tick
scrn_timer
roundrobin
atkbd_timeout
sleepq_timeout
sleepq_timeout
sleepq_timeout
sleepq_timeout
pffasttimo
pfslowtimo
kbdmux_kbd_intr_timo
if_slowtimo
ipport_tick
nd6_timer
lboltcb
tcp_hc_purge
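
As a rough illustration of what a command like "show callouts" has to
do, here is a small user-space sketch. It is not the patch at [1] and
not the kernel's actual data structures: the wheel size, struct layout,
symbol table and every function name other than the two borrowed from
the list above are hypothetical. It walks each bucket of a hashed wheel
and prints the name that matches each callout's function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wheel size; a power of two so a mask selects the bucket. */
#define WHEEL_SIZE 8
#define WHEEL_MASK (WHEEL_SIZE - 1)

typedef void (*callout_fn)(void *);

/* Simplified stand-in for a callout entry (not the kernel's struct). */
struct callout {
    struct callout *next;   /* next entry in the same bucket */
    unsigned int    time;   /* tick at which the callout should fire */
    callout_fn      func;   /* function to call */
    void           *arg;    /* argument passed to func */
};

static struct callout *wheel[WHEEL_SIZE];

/* Dummy handlers, named after two entries from the list above. */
static void loadav(void *arg)     { (void)arg; }
static void roundrobin(void *arg) { (void)arg; }

/* Tiny symbol table standing in for the debugger's symbol lookup. */
static const struct { callout_fn func; const char *name; } symtab[] = {
    { loadav,     "loadav" },
    { roundrobin, "roundrobin" },
};

static const char *lookup_name(callout_fn f)
{
    for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
        if (symtab[i].func == f)
            return symtab[i].name;
    return "unknown";
}

/* Insert at the head of the bucket chosen by the low bits of the time. */
static void wheel_insert(struct callout *c)
{
    assert(c != NULL && c->func != NULL);
    struct callout **head = &wheel[c->time & WHEEL_MASK];
    c->next = *head;
    *head = c;
}

/* Print the function name of every pending callout; return the count. */
static int show_callouts(FILE *out)
{
    int count = 0;
    for (int i = 0; i < WHEEL_SIZE; i++)
        for (struct callout *c = wheel[i]; c != NULL; c = c->next) {
            fprintf(out, "%s\n", lookup_name(c->func));
            count++;
        }
    return count;
}
```

The useful part for debugging is the function pointer: once each
pending entry can be tied back to a name, the subsystem that owns a
misbehaving timer is easier to identify.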

Preemption does not always cause the kernel to panic. When it doesn't,
the boot stops at the mountroot> prompt and cannot mount the root file
system because no disk drives show up.
> 
> > 3) INVARIANTS Invariants causes a panic from a page fault. See [2]
> > for the dmesg and backtrace.
> 
> This appears to be in the start up of Audit as it creates a kernel
> thread. Possibly it's creating the thread too early, or possibly
> something else is going on.  Can you try creating a kernel without
> options AUDIT and see if it works better, or if it just panics when
> the next thread is created?
It just panics when the next thread is created.
> 
> It sounds like Xen may start the timer firing sooner than on plain
> hardware, or possibly at a faster rate initially, and that's causing
> things to happen in a different order, so I expect we'll either bump
> into a series of races of this sort based on different ordering of
> events, or discover the timer isn't properly being disabled or
> managed in Xen :-).
I suspect the timer isn't being managed properly. The timer in the
loader always stays at 10, and with DIAGNOSTIC I'm getting lines like:
Expensive timeout(9) function: 0xc097da70(0xc0bbaa00) -1.982636062 s
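
One possible reading of the negative value above, sketched in user
space: if the interval is computed by subtracting two clock readings
and the result is kept as a normalized timespec (tv_nsec always in
0..999999999), then a clock that steps slightly backwards gives
tv_sec = -1 with a large tv_nsec. Printing the two fields side by side
would then show "-1.98..." even though the real interval is only a few
hundredths of a second negative. This assumes the message is formatted
that way, which this mail does not confirm; the helper names and the
sample clock readings below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* end - start, normalized so that 0 <= tv_nsec < 1000000000. */
static struct timespec ts_sub(struct timespec end, struct timespec start)
{
    struct timespec d;
    d.tv_sec  = end.tv_sec  - start.tv_sec;
    d.tv_nsec = end.tv_nsec - start.tv_nsec;
    if (d.tv_nsec < 0) {
        d.tv_sec  -= 1;
        d.tv_nsec += 1000000000L;
    }
    return d;
}

/* Assumed formatting: seconds and nanoseconds printed side by side. */
static int format_fields(char *buf, size_t len, struct timespec d)
{
    return snprintf(buf, len, "%jd.%09ld s",
                    (intmax_t)d.tv_sec, (long)d.tv_nsec);
}

/* The interval as a real number of seconds. */
static double ts_to_seconds(struct timespec d)
{
    return (double)d.tv_sec + (double)d.tv_nsec / 1e9;
}
```

With a hypothetical start reading of 10.017363938 s and an end reading
of 10.000000000 s, format_fields() produces "-1.982636062 s" while
ts_to_seconds() gives roughly -0.017 s. Under either reading the sign
is negative, which is consistent with the clock appearing to run
backwards across the callout.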

Andrew

[1] http://fubar.geek.nz/files/freebsd/ddb-callout.diff

-- 
Andrew Turner
http://fubar.geek.nz/blog/


