Date: Tue, 20 Sep 2005 16:04:43 -0400
From: John Baldwin <jhb@FreeBSD.org>
To: Koen Martens <fbsd@metro.cx>
Cc: freebsd-hackers@freebsd.org, Dimitry Andric <dimitry@andric.com>, Vinod Kashyap <vkashyap@amcc.com>
Subject: Re: panic in propagate_priority w/ postgresql under heavy load
Message-ID: <200509201604.44393.jhb@FreeBSD.org>
In-Reply-To: <432F1310.80007@metro.cx>
References: <2B3B2AA816369A4E87D7BE63EC9D2F269B7B4D@SDCEXCHANGE01.ad.amcc.com> <432F1310.80007@metro.cx>

On Monday 19 September 2005 03:35 pm, Koen Martens wrote:
> Vinod Kashyap wrote:
> > You seem to be booting off of a 9000 (twa) controller and not
> > 7000/8000 (twe).
> > It could be because of a 9000 firmware bug that you are not being
> > able to get the dump. The firmware wrongly interprets physical
> > address 0x0 as invalid during dumps, and fails the operations.
> > This bug will be fixed in future firmware releases.
>
> Ok, it's been a while, here is an update on this.
>
> I ran a heavily instrumented kernel for two weeks on the server, it
> did not crash in that time. I then took out the witness and kdb/ddb
> stuff, because the decreased performance was a bit of a nuisance,
> however i retained the ability to obtain a crash dump. I had to
> limit physical memory, put it on 1.8GB in loader.conf:hw.physmem
> because swap and physmem are both 2GB. Tested with 'reboot -d' gave
> me a core dump.
>
> Without the debug stuff in the kernel, it crashed within 2 days,
> same story: postgresql process, function propagate_priority.
> However, no dump was written to disk :(
>
> Furthermore, i've been seeing the same crash (in propagate_priority)
> on another box in mysql processes. Both servers seem to panic every
> 2-3 days. I have another server of the exact same hardware
> configuration, but it is mainly idling most of the time. Haven't
> seen that one crash yet.
>
> I am thinking now that it is a bug in the twa driver, so i'll have
> to dig in to that. Furthermore, it seems to have to do with some
> sort of concurrency issue or otherwise timing-sensitive issue,
> because slowing the kernel down with debug code seems to avoid the
> panic. But, as i am completely new to the freebsd kernel and don't
> even know what turnstiles are, i imagine i will have a hard time. So
> if anyone can offer some help, please :)
>
> Ok, thanks for your attention,

This panic usually happens either because a thread went to sleep while
holding a mutex (WITNESS will warn you about this when it happens, but as
you noted, it slows things down). It can also happen perhaps if a thread
exits while holding a lock or if a thread is blocked on a mutex that is
destroyed after it blocks on it.

-- 
John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
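
For readers who, like the original poster, have not met turnstiles before, the three failure cases in the reply can be pictured with a small userland model. This is only a sketch and not the FreeBSD kernel code: every name in it (toy_thread, toy_lock, toy_propagate_priority, the TOY_* constants) is made up for illustration, and it assumes, as in the kernel, that a lower priority number means a more urgent thread. A waiter lends its priority to the lock owner and follows the chain while each owner is itself blocked on another lock. The walk only makes sense if every owner on the chain is runnable or blocked on a lock; an owner that is asleep, has exited, or belonged to a destroyed lock breaks that assumption, and where this model returns an error code a kernel would be more likely to panic.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userland model, NOT FreeBSD kernel code. All names are hypothetical. */
enum toy_state { TOY_RUNNABLE, TOY_BLOCKED_ON_LOCK, TOY_SLEEPING, TOY_EXITED };

struct toy_lock;

struct toy_thread {
	int priority;			/* lower value = more urgent */
	enum toy_state state;
	struct toy_lock *blocked_on;	/* lock being waited for, or NULL */
};

struct toy_lock {
	struct toy_thread *owner;	/* NULL if unowned or destroyed */
};

enum toy_result { TOY_OK, TOY_OWNER_ASLEEP, TOY_OWNER_GONE };

/*
 * Lend the waiter's priority to each lock owner along the chain of
 * blocked threads. Stops early when an owner is already urgent enough.
 */
enum toy_result
toy_propagate_priority(struct toy_thread *waiter)
{
	struct toy_thread *td = waiter;
	int pri;

	assert(waiter != NULL);
	pri = waiter->priority;

	while (td->state == TOY_BLOCKED_ON_LOCK) {
		struct toy_lock *lk = td->blocked_on;

		/* Lock destroyed, or its owner exited while holding it. */
		if (lk == NULL || lk->owner == NULL ||
		    lk->owner->state == TOY_EXITED)
			return (TOY_OWNER_GONE);

		td = lk->owner;

		/* Owner went to sleep while still holding the lock. */
		if (td->state == TOY_SLEEPING)
			return (TOY_OWNER_ASLEEP);

		/* Owner is already at least as urgent: nothing to lend. */
		if (td->priority <= pri)
			return (TOY_OK);

		td->priority = pri;
	}
	return (TOY_OK);
}
```

With a runnable owner at priority 100 and a waiter at priority 10, the call returns TOY_OK and the owner ends up at 10. Mark the owner TOY_SLEEPING and the same call returns TOY_OWNER_ASLEEP without touching its priority; clear the lock's owner pointer and it returns TOY_OWNER_GONE.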