Date:      Thu, 4 Jan 2001 11:41:38 +1030
From:      Greg Lehey <grog@lemis.com>
To:        Roman Shterenzon <roman@xpert.com>
Cc:        Daniel Lang <dl@leo.org>, freebsd-stable@freebsd.org
Subject:   Re: Vinum saga continues
Message-ID:  <20010104114138.H4336@wantadilla.lemis.com>
In-Reply-To: <Pine.LNX.4.30.0101040217000.19919-100000@jamus.xpert.com>; from roman@xpert.com on Thu, Jan 04, 2001 at 02:33:18AM +0200
References:  <20010104103426.C4336@wantadilla.lemis.com> <Pine.LNX.4.30.0101040217000.19919-100000@jamus.xpert.com>

On Thursday,  4 January 2001 at  2:33:18 +0200, Roman Shterenzon wrote:
> On Thu, 4 Jan 2001, Greg Lehey wrote:
>
>> [Format recovered--see http://www.lemis.com/email/email-format.html]
>>
>> On Wednesday,  3 January 2001 at 14:15:14 +0200, Roman Shterenzon wrote:
>>> Hi,
>>>
>>> Attached is the most valuable information that was in my pr 22103.
>>> I've read the vinumdebug and the other guy's PR.
>>> I'm still not getting what is missing.
>>> You told the other guy to submit the backtrace, but it was in fact submitted!
>>> It's in my PR as well.
>>> Your responses are very brief - "please read vinumdebug", but in fact, if
>>> there's something that is missing, you can be more specific.
>>
>> OK.  I don't know what's so difficult about this, but here we go.  On
>> the web page to which I refer, I say:
>>
>>   If you need to contact me because of problems with Vinum, please send
>>   me a mail message with the following information:
>>
>>   - What problems are you having?
>>   - Which version of FreeBSD are you running?
>>
>>     I can't find this in your report.
>
> So I was right. You didn't read it.

I thought this was supposed to be complete information.

>>   - If you have a crash, please supply a backtrace from the dump
>>     analysis as discussed below under Kernel Panics.  Please don't
>>     delete the crash dump; it may be needed for further analysis.
>>
>> Basically, all I can see here is the backtrace, which is still wrapped
>> at 80 characters, despite all my requests.  I've had to manually
>> reformat it to make it legible.  Have you really read the web page?
>
> Hmm.. I don't know why it wrapped around at 80 chars.
> I took it out of the "Raw PR"

Well, the "raw PR" was pretty rough too, but it's probably your MUA.

>> OK, this is *not* the buffer header corruption bug, but it's happening
>> in a very similar position.  With the buffer header corruption, you
>> wouldn't have got as far as this, because b_iodone would be zeroed
>> out.  I also can't see any other obvious damage to the buffer header.
>>
>> What we need to do now is to find out where the trap occurred.  That's
>> at line 199 of complete_rqe, which shows as the very end of the
>> function.  Could you give me the following information from gdb,
>> please?
>>
>>  (gdb) x/20i 0xc150fc60
>>
>> Thanks
>
> Heh :( I wish you read the PR when I submitted it. It was there. I
> only took it out of the closed PR and resent to you.  This crash is
> not available anymore. I used the disks for RAID-1 setup.

OK, this is another one that got away.  I can't do any more without
followup information.

> But, out of curiosity, what is this command supposed to do?

It disassembles 20 instructions starting at the specified address.  It
will give us a better idea of what's going on.
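
As a rough illustration of how that command breaks down, here is a
small Python sketch that splits a gdb "x" (examine) command into its
parts: a repeat count, a format letter ("i" means machine
instructions) and an optional unit-size letter, followed by the
address.  The helper name parse_examine and the Examine tuple are
made up for this example and are not part of gdb.  It also assumes
the address is a plain numeric literal, whereas gdb itself accepts
arbitrary expressions there.

```python
import re
from typing import NamedTuple, Optional

# Format and unit-size letters accepted by gdb's "x" command.
FORMATS = "xduotacfsi"   # "i" = disassemble as machine instructions
UNITS = "bhwg"           # byte, halfword, word, giant (8 bytes)


class Examine(NamedTuple):
    count: int            # how many items to show (default 1)
    fmt: Optional[str]    # display format letter, if given
    unit: Optional[str]   # unit-size letter, if given
    address: int          # start address


def parse_examine(command: str) -> Examine:
    """Split a command such as 'x/20i 0xc150fc60' into its fields.

    Hypothetical helper for illustration only; handles numeric
    addresses, not general gdb expressions.
    """
    m = re.fullmatch(r"x(?:/(\d*)([a-z]*))?\s+(\S+)", command.strip())
    if m is None:
        raise ValueError(f"not an examine command: {command!r}")
    count = int(m.group(1)) if m.group(1) else 1
    fmt = unit = None
    for letter in m.group(2) or "":
        if letter in UNITS:
            unit = letter
        elif letter in FORMATS:
            fmt = letter
        else:
            raise ValueError(f"unknown format letter: {letter!r}")
    # Base 0 lets int() accept 0x-prefixed hex as well as decimal.
    return Examine(count, fmt, unit, int(m.group(3), 0))


if __name__ == "__main__":
    print(parse_examine("x/20i 0xc150fc60"))
```

So for the command above: count 20, format "i", no explicit unit
size, starting at address 0xc150fc60.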

> The fact that the page fault occurs at the end of the function,
> i.e.  "return" hints that perhaps the return address from the call
> was smashed.

Ah, that might be what Alfred was referring to.  Yes, we could have
checked for that.  But under those circumstances you would also expect
to get a junk stack trace (not necessarily, but usually).  Without a
dump, of course, we'll never know.

> I think that the crash can be reproduced - three disks in raid5
> setup, more than 50% filled. find /raid -print should crash it.

Believe me, I tried that again and again.  It worked fine here.

> I can send you my setup. I had 3 x 36 GB IBM drives on an Adaptec.  My
> mutilated and huge PR kern/22103 has more info (dmesg!) that you may
> or may not find interesting.

No, without a dump it's not much help.  I *do* now have an fxp board
here, so I should try that first.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers