Date: Sat, 15 Mar 2003 12:02:23 +1030
From: Greg 'groggy' Lehey
To: Vallo Kallaste
Cc: Darryl Okahata, current@FreeBSD.org
Subject: Re: Vinum R5 [was: Re: background fsck deadlocks with ufs2 and big disk]
Message-ID: <20030315013223.GC90698@wantadilla.lemis.com>
In-Reply-To: <20030314080528.GA1174@kevad.internal>
Organization: The FreeBSD Project
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-418-838-708
WWW-Home-Page: http://www.FreeBSD.org/
X-PGP-Fingerprint: 9A1B 8202 BCCE B846 F92F 09AC 22E6 F290 507A 4223

On Friday, 14 March 2003 at 10:05:28 +0200, Vallo Kallaste wrote:
> On Fri, Mar 14, 2003 at
> 01:16:02PM +1030, Greg 'groggy' Lehey wrote:
>
>>> So I did. Loaned two SCSI disks and 50-pin cable. Things haven't
>>> improved a bit, I'm very sorry to say it.
>>
>> Sorry for the slow reply to this. I thought it would make sense to
>> try things out here, and so I kept trying to find time, but I have to
>> admit I just don't have it yet for a while. I haven't forgotten, and
>> I hope that in a few weeks' time I can spend some time chasing down a
>> whole lot of Vinum issues. This is definitely the worst I have seen,
>> and I'm really puzzled why it always happens to you.
>>
>>> # simulate disk crash by forcing one arbitrary subdisk down
>>> # seems that vinum doesn't return values for command completion status
>>> # checking?
>>> echo "Stopping subdisk.. degraded mode"
>>> vinum stop -f r5.p0.s3 # assume it was successful
>>
>> I wonder if there's something relating to stop -f that doesn't happen
>> during a normal failure. But this was exactly the way I tested it in
>> the first place.
>
> Thank you Greg, I really appreciate your ongoing effort for making
> vinum a stable, trusted volume manager.
> I have to add some facts to the mix. Raidframe on the same hardware
> does not have any problems. The later tests I conducted were done
> under -stable, because I couldn't get raidframe to work under
> -current, system did panic every time at the end of initialisation of
> parity (raidctl -iv raid?). So I used the raidframe patch for
> -stable at
> http://people.freebsd.org/~scottl/rf/2001-08-28-RAIDframe-stable.diff.gz
> Had to do some patching by hand, but otherwise works well.

I don't think that problems with RAIDframe are related to these
problems with Vinum. I seem to remember a commit to the head branch
recently (in the last 12 months) relating to the problem you've seen.
I forget exactly where it went (it wasn't from me), and in cursory
searching I couldn't find it. It's possible that it hasn't been
MFC'd, which would explain your problem.
If you have a 5.0 machine, it would be interesting to see if you can
reproduce it there.

> Will it suffice to switch off power for one disk to simulate "more"
> real-world disk failure? Are there any hidden pitfalls for failing
> and restoring operation of non-hotswap disks?

I don't think so. It was more thinking aloud than anything else. As
I said above, this is the way I tested things in the first place.

Greg
--
See complete headers for address and phone numbers