Date:      Thu, 6 Apr 2000 09:39:49 +0930
From:      Greg Lehey <grog@lemis.com>
To:        Matthew Dillon <dillon@apollo.backplane.com>
Cc:        "Jonathan M. Bresler" <jmb@hub.freebsd.org>, brdean@unx.sas.com, phk@critter.freebsd.dk, blk@skynet.be, asmodai@wxs.nl, current@FreeBSD.ORG
Subject:   Re: Vinum breakage Summary (was Re: Vinum breakage)
Message-ID:  <20000406093949.M66569@freebie.lemis.com>
In-Reply-To: <200004051842.LAA79600@apollo.backplane.com>
References:  <20000405170350.103FC37BBA8@hub.freebsd.org> <200004051842.LAA79600@apollo.backplane.com>

On Wednesday,  5 April 2000 at 11:42:40 -0700, Matthew Dillon wrote:
>> Matt,
>> 	help me understand your patch.  this is how i read it at this
>> time:
>>
>>
>> Matt has just made available an early patch that corrects the vinum
>> panic.  is this the same vinum panic that people are claiming phk
>> created with the bio/buf changes?  i dont know the vinum code.  i dont
>> know the bio/buf code.  i do see that all the code changes are to
>> vinum source files.  none of the changes reference the buf/bio parts
>> for the kernel.
>>
>> the patch is against 4.0, code unaffected by phk's changes.
>> the revision levels are those indicated in Matt's patch and those I
>> received via CVSup last night.
>
>     There is a lot of confusion here, which I will straighten out:
>
>     * 3/20 - phk makes his first buffer cache commit, adding b_iocmd.
>       This breaks vinum, but it takes a while for people to realize it.
>       (see 1.46 sys/dev/vinum/vinumrequest.c and other files)
>
>       This commit is made into -current.  -stable is not affected.
>
>     * 3/26 - alfred fixes phk's typo that broke vinum (1.47 vinumrequest.c,
>       and other files).   (3/26 == last week).

As far as I can tell, this is correct.  I still haven't finished
checking the changes, and there's a possibility that some lesser-used
functions are still broken, though I have no direct evidence for that
beyond the fact they haven't been tested.

>     * During the last week, at around the same time, a panic was traced
>       definitively to vinum.  On Saturday it was traced to the raid5 code.
>
>       Some of the people using vinum, including Greg, are using it under
>       -current.

In fact, in view of what we've seen, this affects all recent versions
of Vinum, but the problem only shows up with RAID-5 and IDE.

>     * Phk begins making more radical commits (to -current) on sunday.
>
>     * Confusion reigns.  I don't think the later commits broke vinum again,
>       but at this point there were a number of people focused on vinum and
>       having the buffer cache ripped out from under them might have resulted
>       in false positives due to people using vinum as a kld rather than
>       building it into the kernel.  I believe there was a message or two
>       in this regard that turned out to be a false positive.

The only issue I see is that it required additional work to get
everything in sync.  It was nuisance value more than anything.

>     * Greg's test machine was running -current.  Greg is dead in the water
>       at this point (i.e. he would need to retool to -stable), and
>       complains mightily (and appropriately, I believe).

Well, I'm not "dead in the water", but I don't see any reason to build
a -STABLE machine at this point, since I can't reproduce the problem
on either.

>       Although it would be better to track the vinum bug down in
>       -stable, the fact remains that many people are using -current.

I don't agree that it's better to track it in -STABLE.  I'm pretty
sure that the bug is unchanged in both releases.

>     * I spend five or six hours setting vinum up on my -stable test box to
>       try to reproduce and fix the panic on monday.
>
>     * I come up with a patch, which Greg is now reviewing & using as a
>       basis for the 'real' fix.  This patch fixes a bug in vinum -- we
>       knew there was one (on saturday, see above).  The only known bug
>       introduced by phk's commit (so far) was fixed by Alfred on 3/26.
>
>       This panic is not related to phk's commits.
>
>       My patch is relative to 4.0.

In fact, the relevant parts are also unchanged in -CURRENT.  I'm going
to commit what I consider to be the core of the fix right now.  I'm
still analysing the rest.

>     * It is currently unknown whether further breakage in -current exists
>       due to phk's changes.  I don't think we'll know whether there are
>       further problems until the currently known bug fix is committed and
>       we see where we stand.

Agreed, though I consider it unlikely.  phk's last lot of commits just
meant a lot of extra work when I could least use it.

>     * I also do not know if Greg has successfully retooled his test
>       box to run -stable.

Not yet.  I'll fix the problem in -CURRENT first.  Once I have
confirmation from sos that the patch fixes his problems, I'll commit
to -STABLE.

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers

