From owner-freebsd-stable Mon Sep 16 7:57:57 2002
Message-ID: <001201c25dd3$56dc0e60$3764a8c0@BONG>
Reply-To: "Jamie Heckford"
From: "Jamie Heckford"
To: , "Drew Derbyshire"
References: <20020916122410.9C5B8BA14@pandora.hh.kew.com>
Subject: Re: vinum / 4.6.2 / mirrored drives
Date: Mon, 16 Sep 2002 15:49:35 -0700

Hi,

I've had quite a few problems _similar_ to this in the past; most were
solved by replacing the SCSI cable and double-checking the termination,
believe it or not!

There is most likely a proper explanation for your problem, but it
couldn't hurt to check that it's all OK.

Cheers
Jamie

----- Original Message -----
From: "Drew Derbyshire"
To:
Sent: Monday, September 16, 2002 5:24 AM
Subject: vinum / 4.6.2 / mirrored drives

> Short version: Are there known issues with vinum used for mirroring
> SCSI drives under 4.6.2?
>
> Long version ...
>
> I have a Dell Gx1/PII 350 with mirrored SCSI via vinum...
>
> FreeBSD pandora.hh.kew.com 4.6.2-RELEASE FreeBSD 4.6.2-RELEASE #16: Fri Aug 16 23:23:07 EDT 2002 ahd@pandora.hh.kew.com:/usr/scratch/obj/usr/src/sys/DELL_GX1 i386
>
> ahc0: port 0xc800-0xc8ff mem 0xff000000-0xff000fff irq 11 at device 13.0 on pci0
> aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
> sa0 at ahc0 bus 0 target 4 lun 0
> sa0: Removable Sequential Access SCSI-2 device
> sa0: 5.000MB/s transfers (5.000MHz, offset 11)
> da0 at ahc0 bus 0 target 0 lun 0
> da0: Fixed Direct Access SCSI-3 device
> da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
> da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
> da1 at ahc0 bus 0 target 1 lun 0
> da1: Fixed Direct Access SCSI-3 device
> da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
> da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
>
> The Exabyte, BTW, is external and was inactive during what follows.
> The drives are internal on an Adaptec(tm) LVD cable.
>
> Four file systems, each with their own plexes, provide the mirroring:
>
> 2 drives:
> D d0              State: up   Device /dev/da0s2g   Avail: 0/16739 MB (0%)
> D d1              State: up   Device /dev/da1s2g   Avail: 0/16739 MB (0%)
>
> 4 volumes:
> V var             State: up   Plexes: 2   Size:  512 MB
> V usr             State: up   Plexes: 2   Size: 2227 MB
> V export          State: up   Plexes: 2   Size: 5000 MB
> V scratch         State: up   Plexes: 2   Size: 9000 MB
>
> 8 plexes:
> P var.p0        C State: up   Subdisks: 1   Size:  512 MB
> P usr.p0        C State: up   Subdisks: 1   Size: 2227 MB
> P export.p0     C State: up   Subdisks: 1   Size: 5000 MB
> P scratch.p0    C State: up   Subdisks: 1   Size: 9000 MB
> P var.p1        C State: up   Subdisks: 1   Size:  512 MB
> P usr.p1        C State: up   Subdisks: 1   Size: 2227 MB
> P export.p1     C State: up   Subdisks: 1   Size: 5000 MB
> P scratch.p1    C State: up   Subdisks: 1   Size: 9000 MB
>
> 8 subdisks:
> S var.p0.s0       State: up   PO: 0 B   Size:  512 MB
> S usr.p0.s0       State: up   PO: 0 B   Size: 2227 MB
> S export.p0.s0    State: up   PO: 0 B   Size: 5000 MB
> S scratch.p0.s0   State: up   PO: 0 B   Size: 9000 MB
> S var.p1.s0       State: up   PO: 0 B   Size:  512 MB
> S usr.p1.s0       State: up   PO: 0 B   Size: 2227 MB
> S export.p1.s0    State: up   PO: 0 B   Size: 5000 MB
> S scratch.p1.s0   State: up   PO: 0 B   Size: 9000 MB
>
> The other partitions on the drives also share the same layout: an
> equal-sized NT slice, plus root and swap space. da0 was the actual
> boot drive.
>
> In the past week, I've started seeing the following:
>
> (da0:ahc0:0:0:0): Invalidating pack
> fatal :scratch.p1.s0 write error, block 16769125 for 1024 bytes
> scratch.p1.s0: user buffer block 919388 for 1024 bytes
>
> The failure occurs at random places on da0, in different subdisks,
> but always on da0, perhaps a few times a day. Restarting the drives
> via vinum start in single user mode after a reboot works 98% of the
> time.
>
> Once the failure happens, it also reports problems in swap space.
> I've never seen the swap space go down the tubes first.
>
> It never failed on a reboot from da0.
>
> The drive was closest to the controller on the cable; for grins, I
> moved both drives to spare connectors on the cable and the problem
> did not move.
>
> (Additional SCSI state was kicked out to the console but does not
> appear in the log. This info complains of timeouts.)
>
> It's intermittent, sufficiently so that I include among
> my suspect causes a software timing problem. I was able to copy
> the entire drive to a ST318406LW which I swapped in, and now the
> suspect drive is pulled. Seagate diagnostics, run on a Dell P/III
> with a different 29160, show no problems on the drive (not surprising,
> given the intermittent nature).
>
> How do I track this down? (Clearly, if the new drive never fails,
> it is the old drive, but how do I prove that to Seagate?)
>
> -ahd-
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-stable" in the body of the message