Date:      Sun, 13 Jun 2004 20:42:32 -0400
From:      Paul Mather <paul@gromit.dlib.vt.edu>
To:        freebsd-current@freebsd.org
Subject:   ATA RAID rebuild not rebuilding
Message-ID:  <1087173751.697.70.camel@zappa.Chelsea-Ct.Org>

I have a recent 5.2-CURRENT system (last built 9th June, 2004) on which
I thought I'd try a RAID 1 mirror using ATA RAID on the built-in Intel
'82371AB/EB/MB PIIX4/4E/4M IDE Controller.'

In summary, I can build and use an ar0 mirror, but whenever I try to
simulate a failure and reconstruction (via atacontrol detach followed by
an atacontrol attach, as described in section 12.4.3 of the FreeBSD
Handbook) the subsequent "atacontrol rebuild ar0" does nothing (it
returns to the prompt immediately) and the ar0 array is still flagged as
DEGRADED (and the detached/attached drive as DOWN).  It appears
impossible to revive the array.

Has anyone successfully managed to fail and replace a drive in such a
setup?  If so, how?

Also, what is the canonical way to construct an array on a built-in
controller?  I have two identical ATA drives: ad1 on ATA channel 0, and
ad2 on ATA channel 1.  Assuming these are both unpartitioned/empty, do I
first have to fdisk the individual drives or is it enough to "atacontrol
create RAID1 ad1 ad2" and then "fdisk -BI ar0" and bsdlabel ar0s1?

Oddly enough, when I boot after a simulated failure, both ad1 and ad2
are probed correctly (and appear under /dev), though ATA RAID declares
"no device found for this disk":

ar0: 24405MB <ATA RAID1 array> [3111/255/63] status: DEGRADED subdisks:
 disk0 READY on ad1 at ata0-slave
 disk1 DOWN no device found for this disk
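
As a rough illustration, status lines in the format shown above could be checked by a small script, for example to flag any subdisk that is not READY. This is a minimal sketch in Python; the helper name parse_ar_status and the regular expressions are assumptions made for illustration and are not part of atacontrol or the ata driver.

```python
import re

# Hypothetical patterns matching the "ar" status lines quoted above.
HEADER_RE = re.compile(
    r"^(ar\d+): (\d+)MB <(.+?)> \[(\d+)/(\d+)/(\d+)\] status: (\w+) subdisks:"
)
DISK_RE = re.compile(r"^\s*disk(\d+) (\w+) (.*)$")


def parse_ar_status(text):
    """Return a dict describing the array found in the text, or None."""
    array = None
    for line in text.splitlines():
        m = HEADER_RE.match(line)
        if m:
            array = {
                "name": m.group(1),
                "size_mb": int(m.group(2)),
                "status": m.group(7),
                "subdisks": [],
            }
            continue
        m = DISK_RE.match(line)
        if m and array is not None:
            array["subdisks"].append(
                {"index": int(m.group(1)), "state": m.group(2), "detail": m.group(3)}
            )
    return array


SAMPLE = """ar0: 24405MB <ATA RAID1 array> [3111/255/63] status: DEGRADED subdisks:
 disk0 READY on ad1 at ata0-slave
 disk1 DOWN no device found for this disk"""

status = parse_ar_status(SAMPLE)
# Collect the indices of subdisks whose state is anything other than READY.
down = [d["index"] for d in status["subdisks"] if d["state"] != "READY"]
print(status["name"], status["status"], "non-ready subdisks:", down)
```

The sketch only reads text; it does nothing to repair the array.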

It seems the only way to get the array to recognise the disk again is to
delete the entire array and create it again. :-(

Where is the ATA RAID metadata stored?  How is consistency maintained? 
Is there some way of forcing a reconstruction onto a particular drive?

In a verbose boot, the array configuration is output.  For some reason,
it is flagged as PROMISE.  Shouldn't this be FREEBSD?  Might that be
confusing the rebuild process, or is rebuilding done identically both on
FreeBSD ATA RAID and Promise ATA RAID controllers?

Here is the pciconf -vl output for the onboard ATA controller:

atapci0@pci0:7:1:       class=0x010180 card=0x00000000 chip=0x71118086 rev=0x01 hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82371AB/EB/MB PIIX4/4E/4M IDE Controller'
    class    = mass storage
    subclass = ATA

and here is the relevant output from a verbose boot after array failure:

FreeBSD 5.2-CURRENT #0: Wed Jun  9 09:14:59 EDT 2004
    paul@zappa.Chelsea-Ct.Org:/usr/obj/usr/src/sys/ZAPPA
[[...]]
CPU: Pentium II/Pentium II Xeon/Celeron (300.68-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x634  Stepping = 4
  Features=0x80f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,MMX>
[[...]]
acpi0: <AWARD AWRDACPI> on motherboard
acpi0: [GIANT-LOCKED]
[[...]]
atapci0: <Intel PIIX4 UDMA33 controller> port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 7.1 on pci0
atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xf000
atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0x1f0
atapci0: Reserved 0x1 bytes for rid 0x14 type 4 at 0x3f6
ata0: reset tp1 mask=03 ostat0=50 ostat1=50
ata0-master: stat=0x50 err=0x01 lsb=0x00 msb=0x00
ata0-slave:  stat=0x50 err=0x01 lsb=0x00 msb=0x00
ata0: reset tp2 mask=03 stat0=50 stat1=50 devices=0x3<ATA_SLAVE,ATA_MASTER>
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
atapci0: Reserved 0x8 bytes for rid 0x18 type 4 at 0x170
atapci0: Reserved 0x1 bytes for rid 0x1c type 4 at 0x376
ata1: reset tp1 mask=03 ostat0=50 ostat1=50
ata1-master: stat=0x50 err=0x01 lsb=0x00 msb=0x00
ata1-slave:  stat=0x10 err=0x01 lsb=0x14 msb=0xeb
ata1: reset tp2 mask=03 stat0=50 stat1=10 devices=0x9<ATAPI_SLAVE,ATA_MASTER>
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
[[...]]
ata0-slave: pio=0x0c wdma=0x22 udma=0x44 cable=80pin
ata0-master: pio=0x0c wdma=0x22 udma=0x44 cable=80pin
ata0-master: setting PIO4 on Intel PIIX4 chip
ata0-master: setting UDMA33 on Intel PIIX4 chip
ata0-slave: setting PIO4 on Intel PIIX4 chip
ata0-slave: setting UDMA33 on Intel PIIX4 chip
ad0: <Maxtor 91366U4/RA530JN0> ATA-5 disk at ata0-master
ad0: 12982MB (26588016 sectors), 26377 C, 16 H, 63 S, 512 B
ad0: 16 secs/int, 1 depth queue, UDMA33
GEOM: new disk ad0
ar: FreeBSD check1 failed
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):1023/15/63 s:63 l:26587953
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ad0s1, start 32256 length 13613031936 end 13613064191
ad1: <IBM-DJNA-352500/J51OA30K> ATA-4 disk at ata0-slave
ad1: 24405MB (49981680 sectors), 49585 C, 16 H, 63 S, 512 B
ad1: 16 secs/int, 1 depth queue, UDMA33
GEOM: Configure ad0s1a, start 0 length 268435456 end 268435455
GEOM: Configure ad0s1b, start 268435456 length 447438848 end 715874303
GEOM: Configure ad0s1c, start 0 length 13613031936 end 13613031935
GEOM: Configure ad0s1d, start 715874304 length 268435456 end 984309759
GEOM: Configure ad0s1e, start 984309760 length 12628722176 end 13613031935
GEOM: new disk ad1
ata1-slave: pio=0x0c wdma=0x22 udma=0xffffffff cable=40pin
ata1-master: pio=0x0c wdma=0x22 udma=0x44 cable=80pin
ata1-master: setting PIO4 on Intel PIIX4 chip
ata1-master: setting UDMA33 on Intel PIIX4 chip
ata1-slave: setting PIO4 on Intel PIIX4 chip
ata1-slave: setting WDMA2 on Intel PIIX4 chip
ad2: <IBM-DJNA-352500/J51OA30K> ATA-4 disk at ata1-master
ad2: 24405MB (49981680 sectors), 49585 C, 16 H, 63 S, 512 B
ad2: 16 secs/int, 1 depth queue, UDMA33
acd0: <CRW6206A/1.2A> CDRW drive at ata1 as slave
acd0: read 344KB/s (1034KB/s) write 344KB/s (344KB/s), 384KB buffer, WDMA2
acd0: Reads: CDR, CDRW, CDDA stream, packet
acd0: Writes: CDR, CDRW, test write
acd0: Audio: play, 128 volume levels
acd0: Mechanism: ejectable tray, unlocked, lock protected
acd0: Medium: no/blank disc
lun             0
magic_0         0x00000000
magic_1         0x00000000
flags           0x5302 5302<PROMISE,DEGRADED,READY,RAID1,>
total_disks 2
generation      2
width           1
heads           255
sectors         63
cylinders       3111
total_sectors   49981617
interleave      -2147483648
reserved        63
offset          0
disk 0: flags = 0x0b b<ONLINE,ASSIGNED,PRESENT>
        ad1
        sectors 49981617
disk 1: flags = 0x02 2<ASSIGNED>
        sectors 0
ar0: 24405MB <ATA RAID1 array> [3111/255/63] status: DEGRADED subdisks:
 disk0 READY on ad1 at ata0-slave
 disk1 DOWN no device found for this disk
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):38/254/63 s:63 l:49978152
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ad1s1, start 32256 length 25588813824 end 25588846079
GEOM: new disk ad2
GEOM: new disk ar0
GEOM: Configure ad1s1a, start 8192 length 25588805632 end 25588813823
GEOM: Configure ad1s1c, start 0 length 25588813824 end 25588813823
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):38/254/63 s:63 l:49978152
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ad2s1, start 32256 length 25588813824 end 25588846079
[0] f:80 typ:165 s(CHS):0/1/1 e(CHS):38/254/63 s:63 l:49978152
[1] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[2] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
[3] f:00 typ:0 s(CHS):0/0/0 e(CHS):0/0/0 s:0 l:0
GEOM: Configure ar0s1, start 32256 length 25588813824 end 25588846079
GEOM: Configure ad2s1a, start 8192 length 25588805632 end 25588813823
GEOM: Configure ad2s1c, start 0 length 25588813824 end 25588813823
GEOM: Configure ar0s1a, start 8192 length 25588805632 end 25588813823
GEOM: Configure ar0s1c, start 0 length 25588813824 end 25588813823
(probe2:ata1:0:0:0): error 22
(probe2:ata1:0:0:0): Unretryable Error
(probe2:ata1:0:0:0): error 22
(probe2:ata1:0:0:0): Unretryable Error
(probe0:ata0:0:0:0): error 22
(probe0:ata0:0:0:0): Unretryable Error
(probe1:ata0:0:1:0): error 22
(probe1:ata0:0:1:0): Unretryable Error
(probe0:ata0:0:0:0): error 22
(probe0:ata0:0:0:0): Unretryable Error
(probe1:ata0:0:1:0): error 22
(probe1:ata0:0:1:0): Unretryable Error
(probe3:ata1:0:1:0): error 6
(probe3:ata1:0:1:0): Unretryable Error
pass0 at ata1 bus 0 target 1 lun 0
pass0: <ATAPI CD-R/RW CRW6206A 1.2A> Removable CD-ROM SCSI-0 device 
pass0: Serial Number \^_
pass0: 16.000MB/s transfers
GEOM: new disk cd0
(cd0:ata1:0:1:0): error 6
(cd0:ata1:0:1:0): Unretryable Error
cd0 at ata1 bus 0 target 1 lun 0
cd0: <ATAPI CD-R/RW CRW6206A 1.2A> Removable CD-ROM SCSI-0 device 
cd0: Serial Number \^_
cd0: 16.000MB/s transfers
cd0: Attempt to query device size failed: NOT READY, Medium not present
(cd0:ata1:0:1:0): error 6
(cd0:ata1:0:1:0): Unretryable Error
(cd0:ata1:0:1:0): error 6
(cd0:ata1:0:1:0): Unretryable Error
Mounting root from ufs:/dev/ad0s1a
start_init: trying /sbin/init
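
The sizes in the log above can be cross-checked with a little arithmetic. The sketch below, in Python, uses only figures copied from the boot output. Reading the 63-sector difference between the raw drive and the array as the "reserved" area is an interpretation of the numbers, not something the log states, and the variable names are chosen purely for illustration.

```python
SECTOR_BYTES = 512

# Figures copied from the verbose boot output.
raw_sectors = 49981680      # "ad1: 24405MB (49981680 sectors)"
reserved = 63               # "reserved        63"
total_sectors = 49981617    # "total_sectors   49981617"
heads, sectors_per_track = 255, 63
cylinders = 3111            # "[3111/255/63]"
slice_start, slice_len = 63, 49978152   # "s:63 l:49978152"

# Assumption: array size equals the raw drive size minus the reserved sectors.
array_matches_raw_minus_reserved = (raw_sectors - reserved == total_sectors)

# The cylinder count is the array size divided by heads * sectors, rounded down.
computed_cylinders = total_sectors // (heads * sectors_per_track)

# Array size in MB (units of 1024 * 1024 bytes), rounded down.
size_mb = total_sectors * SECTOR_BYTES // (1024 * 1024)

# The slice ends exactly on the last whole cylinder boundary.
slice_end_sector = slice_start + slice_len
chs_capacity = cylinders * heads * sectors_per_track

# Byte offsets as reported by GEOM for ar0s1.
geom_start = slice_start * SECTOR_BYTES
geom_length = slice_len * SECTOR_BYTES

print(array_matches_raw_minus_reserved, computed_cylinders, size_mb)
print(slice_end_sector == chs_capacity, geom_start, geom_length)
```

The figures are mutually consistent: the array geometry, the slice table, and the GEOM byte offsets for ar0s1 all agree with one another.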



I use RAIDframe on my NetBSD/alpha system, and I'd very much like to use
ATA RAID on my FreeBSD/i386 5.2-CURRENT box, but only if I can convince
myself I can reconstruct in the event of a drive failure.

(I use Vinum on my FreeBSD 4.10-STABLE box.  Perhaps I can try that
instead, now the GEOM-ified version is in the tree.  Is it possible to
boot a system with the root partition on a Vinum volume with the GEOM
version?)

Cheers,

Paul.
-- 
e-mail: paul@gromit.dlib.vt.edu

"Without music to decorate it, time is just a bunch of boring production
 deadlines or dates by which bills must be paid."
        --- Frank Vincent Zappa


