From owner-freebsd-scsi Mon Mar 6 0: 8:51 2000
Delivered-To: freebsd-scsi@freebsd.org
Received: from mojave.worldwide.lemis.com (dialup98.sydney.net.au [202.61.208.38])
    by hub.freebsd.org (Postfix) with ESMTP id CFA6837BC65;
    Mon, 6 Mar 2000 00:08:41 -0800 (PST)
    (envelope-from grog@mojave.worldwide.lemis.com)
Received: (from grog@localhost)
    by mojave.worldwide.lemis.com (8.9.3/8.9.3) id SAA00373;
    Mon, 6 Mar 2000 18:45:53 +1100 (EST)
    (envelope-from grog)
Date: Mon, 6 Mar 2000 18:45:53 +1100
From: Greg Lehey
To: David Aronchick
Cc: freebsd-scsi@FreeBSD.ORG, freebsd-stable@FreeBSD.ORG
Subject: Re: Vinum vs Adaptec AIC 7890?
Message-ID: <20000306184553.A332@mojave.worldwide.lemis.com>
Reply-To: Greg Lehey
References: <3104082854.952291855@aronchick>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <3104082854.952291855@aronchick>; from aronchick@archegenesis.com on Sun, Mar 05, 2000 at 09:30:55PM -0500
WWW-Home-Page: http://www.lemis.com/~grog
X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF
Organization: LEMIS, PO Box 460, Echunga SA 5153, Australia
Phone: +61-8-8388-8286
Fax: +61-8-8388-8725
Mobile: +61-41-739-7062
Sender: owner-freebsd-scsi@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Sunday, 5 March 2000 at 21:30:55 -0500, David Aronchick wrote:
> Hi--
>
> I've had the following problems...
> we're currently running with 3x18 GB 10k Segate drives and an Asus p2b-ds
> with onboard scsi card. The drives are divided into 3 partitions /usr /var /
> and a RAID 5 of 26 GB.
>
> I was able to recover by just doing vinum start on the stale drive, but it
> brought the system down, and i need to make sure this doesn't happen again.
> Does anyone have any suggestions? Is this a CAM or vinum problem? or should
> I look to hardware. Here's the standard list.

Hmm. There are a couple of things missing here, like the dump.
The log files show that there's something wrong with /dev/da1
(unrecovered data error; you should check whether you have ARRE and
AWRE set), but that shouldn't cause the system to hang up. I'd
strongly doubt that the problem has anything to do with the host
adapter. I don't know of anything in Vinum which would cause these
problems either, so we'd really need to know more details.

Sorry I can't give you any more ideas, but there's not much to go on.

> What problems are you having? Using lftp, I was in the midst of
> ftping a 50 MB or so file to the raid directly. After about 5% was
> done, the entire box froze. As it is remote, I don't know if the
> drives were accessing, but all open ssh sessions just stopped, i
> could ping and nmap the box, but when I tried to initiate an ssh,
> they would just open, and sit there. I've previously been able to
> copy a couple of hundred MB back and forth, with seemingly no
> problems. That was a few days ago.
>
> Which version of FreeBSD are you running?
>
> 3.4-STABLE
>
> Have you made any changes to the system sources, including Vinum?
> No, everything is unchanged.
>
> Kernel stuff:
> # vinum list
> Configuration summary
>
> Drives:   3 (4 configured)
> Volumes:  1 (4 configured)
> Plexes:   1 (8 configured)
> Subdisks: 3 (16 configured)
>
> D d0          State: up       Device /dev/da0s2e  Avail: 311/15311 MB (2%)
> D d1          State: up       Device /dev/da1s2e  Avail: 311/15311 MB (2%)
> D d2          State: up       Device /dev/da2s2e  Avail: 311/15311 MB (2%)
>
> V raid5       State: up       Plexes: 1    Size: 29 GB
>
> P raid5.p0 R5 State: degraded Subdisks: 3  Size: 29 GB
>
> S raid5.p0.s0 State: up       PO: 0 B      Size: 14 GB
> S raid5.p0.s1 State: stale    PO: 512 kB   Size: 14 GB
> S raid5.p0.s2 State: up       PO: 1024 kB  Size: 14 GB
>
> # tail -100 /var/log/messages
>
> [...]
> Mar 5 13:25:18 db /kernel: (da1:ahc0:0:1:0): READ(10).
> CDB: 28 0 0 41 82 4e 0 0 80 0
> Mar 5 13:25:18 db /kernel: (da1:ahc0:0:1:0): MEDIUM ERROR info:41825b asc:11,0
> Mar 5 13:25:18 db /kernel: (da1:ahc0:0:1:0): Unrecovered read error field replaceable unit: e4 sks:80,101
> Mar 5 13:25:18 db /kernel: raid5.p0.s1: fatal read I/O error
> Mar 5 13:25:18 db /kernel: vinum: raid5.p0.s1 is crashed by force
> Mar 5 13:25:18 db /kernel: vinum: raid5.p0 is degraded
> Mar 5 13:25:18 db /kernel: raid5.p0.s1: fatal write I/O error
> Mar 5 13:25:18 db /kernel: vinum: raid5.p0.s1 is stale by force
> Mar 5 16:52:08 db /kernel: Copyright (c) 1992-1999 FreeBSD Inc.
> [ the machine was manually rebooted 3 hours later ]

Greg
--
Finger grog@lemis.com for PGP public key
See complete headers for address and phone numbers

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-scsi" in the body of the message
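
The kernel messages quoted above can be read a little more closely by decoding the READ(10) command block by hand. The sketch below is only an illustration: `decode_read10` is a hypothetical helper, not part of FreeBSD or CAM, and it assumes the standard READ(10) layout (opcode 0x28, a big-endian logical block address in bytes 2-5, a big-endian transfer length in bytes 7-8). It also assumes that the `info:` value in the MEDIUM ERROR line is the address of the failing block.

```python
# Hypothetical helper (not a FreeBSD or CAM API): decode a READ(10) CDB
# given as space-separated hex bytes, as printed in the kernel log.
def decode_read10(cdb_hex):
    cdb = [int(b, 16) for b in cdb_hex.split()]
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    # Bytes 2-5: logical block address, most significant byte first.
    lba = int.from_bytes(bytes(cdb[2:6]), "big")
    # Bytes 7-8: transfer length in blocks, most significant byte first.
    length = int.from_bytes(bytes(cdb[7:9]), "big")
    return lba, length

# Values copied from the log lines quoted in the message.
lba, length = decode_read10("28 0 0 41 82 4e 0 0 80 0")

# Assumption: "info:41825b" is the hex address of the block that failed.
failed_block = int("41825b", 16)
offset = failed_block - lba

print("start block:", lba)
print("blocks requested:", length)
print("failing block offset within request:", offset)
```

Under those assumptions the failing block lies inside the requested range, which is consistent with a single unreadable block on da1 rather than a fault in the host adapter.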