Date: Sun, 26 Aug 2007 00:07:08 +1200
From: Andrew Turner <andrew@fubar.geek.nz>
To: Robert Watson <rwatson@FreeBSD.org>
Cc: freebsd-current@freebsd.org
Subject: Re: FreeBSD on xen hvm
Message-ID: <20070826000708.15fbb5bb@hermies.int.fubar.geek.nz>
In-Reply-To: <20070824132409.W3900@fledge.watson.org>
References: <20070824181627.57bed401@hermies.int.fubar.geek.nz> <20070824132409.W3900@fledge.watson.org>

On Fri, 24 Aug 2007 13:30:17 +0100 (BST)
Robert Watson <rwatson@FreeBSD.org> wrote:

> On Fri, 24 Aug 2007, Andrew Turner wrote:
>
> > 1) PREEMPTION Preemption causes the kernel to panic with a page
> > fault. The dmesg is available from [1].
>
> Any chance it's possible to get a core for this, or attach GDB
> somehow to the VM?

I haven't managed to get remote GDB working, and it's too early in the
boot for a core. I can get a Xen core dump, but it would require
processing to get it into something gdb could use.

> It looks like timing in Xen may be exposing a
> race in some or another subsystem with timers, but figuring out which
> subsystem it is will be most easily done if we can inspect the
> callout information, which is most easily done with GDB since you can
> inspect the callout structure more easily. If not, then we can add
> some printfs to extract the information, I expect, or extend DDB. We
> need to find out what the function pointer in the callout structure
> is for.

I've created a patch at [1] to add "show callouts" to ddb. It prints
all the callouts in callwheel and the name of the function each one
calls. The callouts with preemption are:

loadav
in6_tmpaddrtimer
in6_rtqtimo
in_rtqtimo
in6_mtutimo
uma_timeout
nd6_slowtimo
nfsrv_timer
tcp_isn_tick
scrn_timer
roundrobin
atkbd_timeout
sleepq_timeout
sleepq_timeout
sleepq_timeout
sleepq_timeout
pffasttimo
pfslowtimo
kbdmux_kbd_intr_timo
if_slowtimo
ipport_tick
nd6_timer
lboltcb
tcp_hc_purge

Preemption does not always cause the kernel to panic; however, when it
doesn't, the kernel stops at the mountroot> prompt and is unable to
load the root as no disk drives show up.

> > 3) INVARIANTS Invariants causes a panic from a page fault. See [2]
> > for the dmesg and backtrace.
>
> This appears to be in the start up of Audit as it creates a kernel
> thread. Possibly it's creating the thread too early, or possibly
> something else is going on. Can you try creating a kernel without
> options AUDIT and see if it works better, or if it just panics when
> the next thread is created?

It just panics when the next thread is created.

> It sounds like Xen may start the timer firing sooner than on plain
> hardware, or possibly at a faster rate initially, and that's causing
> things to happen in a different order, so I expect we'll either bump
> into a series of races of this sort based on different ordering of
> events, or discover the timer isn't properly being disabled or
> managed in Xen :-).

I suspect the timer isn't being managed properly. The timer in the
loader always stays at 10, and with DIAGNOSTIC I'm getting lines like:

Expensive timeout(9) function: 0xc097da70(0xc0bbaa00) -1.982636062 s

Andrew

[1] http://fubar.geek.nz/files/freebsd/ddb-callout.diff

-- 
Andrew Turner
http://fubar.geek.nz/blog/