Date:      Wed, 29 Dec 2004 15:53:44 +0000
From:      Tony Byrne <freebsd-current@byrnehq.com>
To:        Scott Long <scottl@freebsd.org>
Cc:        freebsd-stable@freebsd.org
Subject:   Re[2]: MegaRAID 'Bad Slot' Kernel message and crash.
Message-ID:  <1694776352.20041229155344@byrnehq.com>
In-Reply-To: <41D2C32C.7090803@freebsd.org>
References:  <187186864.20041229111855@byrnehq.com> <41D2C32C.7090803@freebsd.org>

Hello Scott,

Wednesday, December 29, 2004, 2:46:04 PM, you wrote:

SL> I've been seeing this problem recently too.  I believe that there is
SL> some sort of timing bug/race in the driver, but I haven't been able to
SL> figure it out yet.  It also seems to be related to panic from the block
SL> layer that point to commands being completed twice.  To be clear with
SL> your observations, are you saying that 4.10-RELEASE is behaving the same
SL> or differently than 4.10-STABLE?

We tried 5.3 just after RELEASE, if I recall correctly, but had updated our
sources and rebuilt world before running our tests.  Under 5.3 we wedged
the controller a number of times in the space of 3 days, each time with a
"bad slot" kernel message.

Once we decided that 5.3 was not going to cut it for us, we
downgraded to 4.10-STABLE (circa 16th Nov) and re-ran our tests.
This time we couldn't wedge the system.  The server has been in
production on 4.10-STABLE for about a month and yesterday was the
first "bad slot" wedge we've seen.  I'd hate to think that we can now
look forward to a monthly trip to the hosting facility to hard reset the
box :-(

Regards,

Tony.

-- 
Tony Byrne