Date:      Mon, 19 Sep 2005 21:35:44 +0200
From:      Koen Martens <fbsd@metro.cx>
To:        Vinod Kashyap <vkashyap@amcc.com>
Cc:        freebsd-hackers@freebsd.org, Dimitry Andric <dimitry@andric.com>
Subject:   Re: panic in propagate_priority w/ postgresql under heavy load
Message-ID:  <432F1310.80007@metro.cx>
In-Reply-To: <2B3B2AA816369A4E87D7BE63EC9D2F269B7B4D@SDCEXCHANGE01.ad.amcc.com>
References:  <2B3B2AA816369A4E87D7BE63EC9D2F269B7B4D@SDCEXCHANGE01.ad.amcc.com>

Vinod Kashyap wrote:
> You seem to be booting off of a 9000 (twa) controller and not 7000/8000
> (twe). It could be because of a 9000 firmware bug that you are not being
> able to get the dump.  The firmware wrongly interprets physical address
> 0x0 as invalid during dumps, and fails the operations.  This bug will be
> fixed in future firmware releases.

OK, it's been a while; here is an update on this.

I ran a heavily instrumented kernel on the server for two weeks, and
it did not crash in that time. I then took out the witness and kdb/ddb
stuff, because the decreased performance was a bit of a nuisance, but
I retained the ability to obtain a crash dump. Because swap and
physical memory are both 2GB, I had to limit physical memory to 1.8GB
(hw.physmem in loader.conf) so that a dump fits in swap. A test with
'reboot -d' gave me a core dump.
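
A sketch of the setup described above, for anyone reproducing it. The
exact value format and the dump device name are assumptions, not copied
from the machine in question; /dev/da0s1b in particular is a
hypothetical swap partition.

```
# /boot/loader.conf
hw.physmem="1800M"      # cap usable RAM below the 2GB swap size

# /etc/rc.conf
dumpdev="/dev/da0s1b"   # hypothetical swap partition; use the real one
dumpdir="/var/crash"    # where savecore(8) puts the dump on next boot
```

With this in place, 'reboot -d' forces a dump on the way down, and
savecore(8) should copy it from the dump device into dumpdir during the
next boot.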

Without the debug code in the kernel, it crashed within 2 days. Same
story: a postgresql process, in function propagate_priority.
However, no dump was written to disk :(

Furthermore, I've been seeing the same crash (in propagate_priority)
on another box, in mysql processes. Both servers seem to panic every
2-3 days. I have a third server with the exact same hardware
configuration, but it is idle most of the time. I haven't seen that
one crash yet.

I now think it is a bug in the twa driver, so I'll have to dig into
that. It also seems to be some sort of concurrency or otherwise
timing-sensitive issue, because slowing the kernel down with debug
code seems to avoid the panic. But as I am completely new to the
FreeBSD kernel and don't even know what turnstiles are, I imagine I
will have a hard time. So if anyone can offer some help, please :)

Ok, thanks for your attention,

Koen

-- 
K.F.J. Martens, Sonologic, http://www.sonologic.nl/
Networking, hosting, embedded systems, unix, artificial intelligence.
Public PGP key: http://www.metro.cx/pubkey-gmc.asc
Wondering about the funny attachment your mail program
can't read? Visit http://www.openpgp.org/


