Date: Mon, 16 Sep 2002 08:24:10 -0400 (EDT)
From: ahd@kew.com (Drew Derbyshire)
To: stable@freebsd.org
Subject: vinum / 4.6.2 / mirrored drives
Message-ID: <20020916122410.9C5B8BA14@pandora.hh.kew.com>
Short version: Are there known issues with vinum used for mirroring SCSI
drives under 4.6.2?

Long version ...

I have a Dell Gx1/PII 350 with mirrored SCSI via vinum...

FreeBSD pandora.hh.kew.com 4.6.2-RELEASE FreeBSD 4.6.2-RELEASE #16: Fri Aug 16 23:23:07 EDT 2002 ahd@pandora.hh.kew.com:/usr/scratch/obj/usr/src/sys/DELL_GX1 i386

ahc0: <Adaptec 29160N Ultra160 SCSI adapter> port 0xc800-0xc8ff mem 0xff000000-0xff000fff irq 11 at device 13.0 on pci0
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

sa0 at ahc0 bus 0 target 4 lun 0
sa0: <EXABYTE EXB-8505 0051> Removable Sequential Access SCSI-2 device
sa0: 5.000MB/s transfers (5.000MHz, offset 11)
da0 at ahc0 bus 0 target 0 lun 0
da0: <SEAGATE ST318405LW 5063> Fixed Direct Access SCSI-3 device
da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da0: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)
da1 at ahc0 bus 0 target 1 lun 0
da1: <SEAGATE ST318405LW 5063> Fixed Direct Access SCSI-3 device
da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
da1: 17501MB (35843670 512 byte sectors: 255H 63S/T 2231C)

The Exabyte, BTW, is external and was inactive during what follows. The
drives are internal on an Adaptec(tm) LVD cable.
Four file systems, each with its own pair of plexes, provide the mirroring:

2 drives:
D d0                State: up   Device /dev/da0s2g   Avail: 0/16739 MB (0%)
D d1                State: up   Device /dev/da1s2g   Avail: 0/16739 MB (0%)

4 volumes:
V var               State: up   Plexes: 2   Size:  512 MB
V usr               State: up   Plexes: 2   Size: 2227 MB
V export            State: up   Plexes: 2   Size: 5000 MB
V scratch           State: up   Plexes: 2   Size: 9000 MB

8 plexes:
P var.p0          C State: up   Subdisks: 1   Size:  512 MB
P usr.p0          C State: up   Subdisks: 1   Size: 2227 MB
P export.p0       C State: up   Subdisks: 1   Size: 5000 MB
P scratch.p0      C State: up   Subdisks: 1   Size: 9000 MB
P var.p1          C State: up   Subdisks: 1   Size:  512 MB
P usr.p1          C State: up   Subdisks: 1   Size: 2227 MB
P export.p1       C State: up   Subdisks: 1   Size: 5000 MB
P scratch.p1      C State: up   Subdisks: 1   Size: 9000 MB

8 subdisks:
S var.p0.s0         State: up   PO: 0 B   Size:  512 MB
S usr.p0.s0         State: up   PO: 0 B   Size: 2227 MB
S export.p0.s0      State: up   PO: 0 B   Size: 5000 MB
S scratch.p0.s0     State: up   PO: 0 B   Size: 9000 MB
S var.p1.s0         State: up   PO: 0 B   Size:  512 MB
S usr.p1.s0         State: up   PO: 0 B   Size: 2227 MB
S export.p1.s0      State: up   PO: 0 B   Size: 5000 MB
S scratch.p1.s0     State: up   PO: 0 B   Size: 9000 MB

The other partitions on the two drives also share the same layout: an
equal-sized NT slice, plus root and swap space. da0 was the actual boot
drive.

In the past week, I've started seeing the following:

(da0:ahc0:0:0:0): Invalidating pack
fatal :scratch.p1.s0 write error, block 16769125 for 1024 bytes
scratch.p1.s0: user buffer block 919388 for 1024 bytes

The failure occurs at random places on da0, in different subdisks, but
always on da0, perhaps a few times a day. Restarting the drives via vinum
start in single user mode after a reboot works 98% of the time.

Once the failure happens, it also reports problems in swap space. I've
never seen the swap space go down the tubes first. It never failed on a
reboot from da0.
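One way to look for a pattern in failures like the one quoted above is to
tally them per subdisk. What follows is only a minimal Python sketch, under
stated assumptions: the kernel messages have been captured as text lines,
the regular expression is modeled solely on the single error line quoted in
this message (other vinum messages may be worded differently), and
tally_errors is a hypothetical helper name, not part of vinum or FreeBSD.

```python
import re
from collections import Counter

# Pattern modeled on the one error line quoted in this message.
# Assumption: read errors are worded the same way as write errors.
ERROR_RE = re.compile(
    r"(?P<subdisk>[\w.]+)\s+(?P<op>read|write) error, "
    r"block (?P<block>\d+) for (?P<size>\d+) bytes"
)

def tally_errors(lines):
    """Count matching I/O error lines per (subdisk, operation) pair."""
    counts = Counter()
    for line in lines:
        m = ERROR_RE.search(line)
        if m:
            counts[(m.group("subdisk"), m.group("op"))] += 1
    return counts

# The three console lines quoted in this message.
sample = [
    "(da0:ahc0:0:0:0): Invalidating pack",
    "fatal :scratch.p1.s0 write error, block 16769125 for 1024 bytes",
    "scratch.p1.s0: user buffer block 919388 for 1024 bytes",
]

counts = tally_errors(sample)
for (subdisk, op), n in sorted(counts.items()):
    print(subdisk, op, n)
```

Fed a longer capture of console output, the same tally might show whether
the errors cluster on particular subdisks or spread evenly across the drive.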
The drive was closest to the controller on the cable; for grins, I moved
both drives to spare connectors on the cable, and the problem did not move.
(Additional SCSI state was kicked out to the console, but does not appear in
the log. This info complains of timeouts.)

It's <expletive> intermittent, sufficiently so that I include a software
timing problem among my suspect causes.

I was able to copy the entire drive to an ST318406LW, which I swapped in,
and the suspect drive is now pulled. Seagate diagnostics, run on a Dell
P/III with a different 29160, show no problems on the drive (not surprising,
given the intermittent nature).

How do I track this down? (Clearly, if the new drive never fails, it is the
old drive, but how do I prove that to Seagate?)

-ahd-

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message