Date: Sun, 13 Jun 2004 20:42:32 -0400
From: Paul Mather <paul@gromit.dlib.vt.edu>
To: freebsd-current@freebsd.org
Subject: ATA RAID rebuild not rebuilding
Message-ID: <1087173751.697.70.camel@zappa.Chelsea-Ct.Org>

I have a recent 5.2-CURRENT system (last built 9th June, 2004) on which I thought I'd try a RAID 1 mirror using ATA RAID on the built-in Intel '82371AB/EB/MB PIIX4/4E/4M IDE Controller.'

In summary, I can build and use an ar0 mirror, but whenever I try to simulate a failure and reconstruction (via atacontrol detach followed by an atacontrol attach, as described in section 12.4.3 of the FreeBSD Handbook) the subsequent "atacontrol rebuild ar0" does nothing (it returns to the prompt immediately) and the ar0 array is still flagged as DEGRADED (and the detached/attached drive as DOWN). It appears impossible to revive the array.

Has anyone successfully managed to fail and replace a drive in such a setup? If so, how?

Also, what is the canonical way to construct an array on a built-in controller? I have two identical ATA drives: ad1 on ATA channel 0, and ad2 on ATA channel 1. Assuming these are both unpartitioned/empty, do I first have to fdisk the individual drives or is it enough to "atacontrol create RAID1 ad1 ad2" and then "fdisk -BI ar0" and bsdlabel ar0s1?

Oddly enough, when I boot after a simulated failure, both ad1 and ad2 are probed correctly (and appear under /dev), though ATA RAID declares "no device found for this disk":

ar0: 24405MB <ATA RAID1 array> [3111/255/63] status: DEGRADED subdisks:
 disk0 READY on ad1 at ata0-slave
 disk1 DOWN no device found for this disk

It seems the only way to get the array to recognise the disk again is to delete the entire array and create it again. :-(

Where is the ATA RAID metadata stored? How is consistency maintained? Is there some way of forcing a reconstruction onto a particular drive?

In a verbose boot, the array configuration is output. For some reason, it is flagged as PROMISE. Shouldn't this be FREEBSD? Might that be confusing the rebuild process, or is rebuilding done identically both on FreeBSD ATA RAID and Promise ATA RAID controllers?

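The failure simulation described above can be written out as a short script. This is only a sketch under stated assumptions, not a verified procedure: it assumes the detached drive is ad2 on ATA channel 1 (the status output reports disk1 as DOWN), and the form of the channel argument (a bare number here) is an assumption that may differ between releases, so check atacontrol(8) on the system in question. The `run` wrapper and the `fail_and_rebuild` function are hypothetical helpers; by default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that echoes a command in dry-run mode,
# and executes it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# fail_and_rebuild CHANNEL ARRAY
# Simulates a drive failure on CHANNEL, re-attaches the channel,
# then asks for a rebuild of ARRAY and shows its status.
fail_and_rebuild() {
    channel="$1"   # assumption: bare channel number; may be "ata1" on other releases
    array="$2"
    run atacontrol detach "$channel"
    run atacontrol attach "$channel"
    run atacontrol rebuild "$array"
    run atacontrol status "$array"
}

fail_and_rebuild 1 ar0
```

Setting DRY_RUN=0 executes the commands instead of printing them. On the system described here, the rebuild step is the one that returns immediately and leaves ar0 DEGRADED.
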
Here is the pciconf -vl output for the onboard ATA controller:

atapci0@pci0:7:1: class=0x010180 card=0x00000000 chip=0x71118086 rev=0x01 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82371AB/EB/MB PIIX4/4E/4M IDE Controller'
    class = mass storage
    subclass = ATA

and here is the relevant output from a verbose boot after array failure:

FreeBSD 5.2-CURRENT #0: Wed Jun 9 09:14:59 EDT 2004
    paul@zappa.Chelsea-Ct.Org:/usr/obj/usr/src/sys/ZAPPA
[[...]]
CPU: Pentium II/Pentium II Xeon/Celeron (300.68-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x634 Stepping = 4
  Features=0x80f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,MMX>
[[...]]
acpi0: <AWARD AWRDACPI> on motherboard
acpi0: [GIANT-LOCKED]
[[...]]
atapci0: <Intel PIIX4 UDMA33 controller> port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 7.1 on pci0
atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xf000
atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0x1f0
atapci0: Reserved 0x1 bytes for rid 0x14 type 4 at 0x3f6
ata0: reset tp1 mask=03 ostat0=50 ostat1=50
ata0-master: stat=0x50 err=0x01 lsb=0x00 msb=0x00
ata0-slave: stat=0x50 err=0x01 lsb=0x00 msb=0x00
ata0: reset tp2 mask=03 stat0=50 stat1=50 devices=0x3<ATA_SLAVE,ATA_MASTER>
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
atapci0: Reserved 0x8 bytes for rid 0x18 type 4 at 0x170
atapci0: Reserved 0x1 bytes for rid 0x1c type 4 at 0x376
ata1: reset tp1 mask=03 ostat0=50 ostat1=50
ata1-master: stat=0x50 err=0x01 lsb=0x00 msb=0x00
ata1-slave: stat=0x10 err=0x01 lsb=0x14 msb=0xeb
ata1: reset tp2 mask=03 stat0=50 stat1=10 devices=0x9<ATAPI_SLAVE,ATA_MASTER>
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
[[...]]
ata0-slave: pio=0x0c wdma=0x22 udma=0x44 cable=80pin
ata0-master: pio=0x0c wdma=0x22 udma=0x44 cable=80pin
ata0-master: setting PIO4 on Intel PIIX4 chip
ata0-master: setting UDMA33 on Intel PIIX4 chip
ata0-slave: setting PIO4 on Intel PIIX4 chip
ata0-slave: setting UDMA33 on Intel PIIX4 chip
ad0: <Maxtor 91366U4/RA530JN0> ATA-5 disk at ata0-master
ad0: 12982MB (26588016 sectors), 26377 C, 16 H, 63 S, 512 B
ad0: 16 secs/int, 1 depth queue, UDMA33
GEOM: new disk ad0
ar: FreeBSD check1 failed
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):1023/15/63 s:63 l:26587953
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ad0s1, start 32256 length 13613031936 end 13613064191
ad1: <IBM-DJNA-352500/J51OA30K> ATA-4 disk at ata0-slave
ad1: 24405MB (49981680 sectors), 49585 C, 16 H, 63 S, 512 B
ad1: 16 secs/int, 1 depth queue, UDMA33
GEOM: Configure ad0s1a, start 0 length 268435456 end 268435455
GEOM: Configure ad0s1b, start 268435456 length 447438848 end 715874303
GEOM: Configure ad0s1c, start 0 length 13613031936 end 13613031935
GEOM: Configure ad0s1d, start 715874304 length 268435456 end 984309759
GEOM: Configure ad0s1e, start 984309760 length 12628722176 end 13613031935
GEOM: new disk ad1
ata1-slave: pio=0x0c wdma=0x22 udma=0xffffffff cable=40pin
ata1-master: pio=0x0c wdma=0x22 udma=0x44 cable=80pin
ata1-master: setting PIO4 on Intel PIIX4 chip
ata1-master: setting UDMA33 on Intel PIIX4 chip
ata1-slave: setting PIO4 on Intel PIIX4 chip
ata1-slave: setting WDMA2 on Intel PIIX4 chip
ad2: <IBM-DJNA-352500/J51OA30K> ATA-4 disk at ata1-master
ad2: 24405MB (49981680 sectors), 49585 C, 16 H, 63 S, 512 B
ad2: 16 secs/int, 1 depth queue, UDMA33
acd0: <CRW6206A/1.2A> CDRW drive at ata1 as slave
acd0: read 344KB/s (1034KB/s) write 344KB/s (344KB/s), 384KB buffer, WDMA2
acd0: Reads: CDR, CDRW, CDDA stream, packet
acd0: Writes: CDR, CDRW, test write
acd0: Audio: play, 128 volume levels
acd0: Mechanism: ejectable tray, unlocked, lock protected
acd0: Medium: no/blank disc
lun 0
magic_0 0x00000000
magic_1 0x00000000
flags 0x5302 5302<PROMISE,DEGRADED,READY,RAID1,>
total_disks 2
generation 2
width 1
heads 255
sectors 63
cylinders 3111
total_sectors 49981617
interleave -2147483648
reserved 63
offset 0
disk 0: flags = 0x0b b<ONLINE,ASSIGNED,PRESENT> ad1 sectors 49981617
disk 1: flags = 0x02 2<ASSIGNED> sectors 0
ar0: 24405MB <ATA RAID1 array> [3111/255/63] status: DEGRADED subdisks:
 disk0 READY on ad1 at ata0-slave
 disk1 DOWN no device found for this disk
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):38/254/63 s:63 l:49978152
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ad1s1, start 32256 length 25588813824 end 25588846079
GEOM: new disk ad2
GEOM: new disk ar0
GEOM: Configure ad1s1a, start 8192 length 25588805632 end 25588813823
GEOM: Configure ad1s1c, start 0 length 25588813824 end 25588813823
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):38/254/63 s:63 l:49978152
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ad2s1, start 32256 length 25588813824 end 25588846079
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):38/254/63 s:63 l:49978152
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ar0s1, start 32256 length 25588813824 end 25588846079
GEOM: Configure ad2s1a, start 8192 length 25588805632 end 25588813823
GEOM: Configure ad2s1c, start 0 length 25588813824 end 25588813823
GEOM: Configure ar0s1a, start 8192 length 25588805632 end 25588813823
GEOM: Configure ar0s1c, start 0 length 25588813824 end 25588813823
(probe2:ata1:0:0:0): error 22
(probe2:ata1:0:0:0): Unretryable Error
(probe2:ata1:0:0:0): error 22
(probe2:ata1:0:0:0): Unretryable Error
(probe0:ata0:0:0:0): error 22
(probe0:ata0:0:0:0): Unretryable Error
(probe1:ata0:0:1:0): error 22
(probe1:ata0:0:1:0): Unretryable Error
(probe0:ata0:0:0:0): error 22
(probe0:ata0:0:0:0): Unretryable Error
(probe1:ata0:0:1:0): error 22
(probe1:ata0:0:1:0): Unretryable Error
(probe3:ata1:0:1:0): error 6
(probe3:ata1:0:1:0): Unretryable Error
pass0 at ata1 bus 0 target 1 lun 0
pass0: <ATAPI CD-R/RW CRW6206A 1.2A> Removable CD-ROM SCSI-0 device
pass0: Serial Number \^_
pass0: 16.000MB/s transfers
GEOM: new disk cd0
(cd0:ata1:0:1:0): error 6
(cd0:ata1:0:1:0): Unretryable Error
cd0 at ata1 bus 0 target 1 lun 0
cd0: <ATAPI CD-R/RW CRW6206A 1.2A> Removable CD-ROM SCSI-0 device
cd0: Serial Number \^_
cd0: 16.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
(cd0:ata1:0:1:0): error 6
(cd0:ata1:0:1:0): Unretryable Error
(cd0:ata1:0:1:0): error 6
(cd0:ata1:0:1:0): Unretryable Error
Mounting root from ufs:/dev/ad0s1a
start_init: trying /sbin/init

I use RAIDframe on my NetBSD/alpha system, and I'd very much like to use ATA RAID on my FreeBSD/i386 5.2-CURRENT box, but only if I can convince myself I can reconstruct in the event of a drive failure. (I use Vinum on my FreeBSD 4.10-STABLE box. Perhaps I can try that instead, now the GEOM-ified version is in the tree. Is it possible to boot a system with the root partition on a Vinum volume with the GEOM version?)

Cheers,

Paul.

--
e-mail: paul@gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring production deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa