From: Scott Lambert
Date: Sun, 6 Jun 2010 13:55:51 -0500
To: freebsd-stable@freebsd.org
Cc: Edwin Groothuis
Subject: Re: gmirror refused to connect second disk after a reboot
Message-ID: <20100606185551.GA267@sysmon.tcworks.net>
In-Reply-To: <20100606052509.GA4744@mavetju.org>
References: <20100606052509.GA4744@mavetju.org>

On Sun, Jun 06, 2010 at 03:25:09PM +1000, Edwin Groothuis wrote:
> For two years I've had a happy gmirror RAID1 system. And a week or
> three ago I was found a degraded system due to a broken disk.
>
> I tried to replace the disk, first with one three sectors too small
> which didn't want to be entered in the array (as excepted), then
> with a same brand/type one which I added without a problem. Rebuilding,
> everything okay.
>
> [~] edwin@k7>sudo fdisk -s /dev/ad1
> /dev/ad1: 1938021 cyl 16 hd 63 sec
> Part        Start        Size Type Flags
>    1:          63  1953520002 0xa5 0x00
> [~] edwin@k7>sudo fdisk -s /dev/ad3
> /dev/ad3: 1938021 cyl 16 hd 63 sec
> Part        Start        Size Type Flags
>    1:          63  1953520002 0xa5 0x80
>
> [~] edwin@k7>gmirror status
>       Name    Status  Components
> mirror/gm0  COMPLETE  ad1
>                       ad3
>
> Until after a reboot, then GEOM complains about:
>
> GEOM: ad3s1: geometry does not match label (255h,63s != 16h,63s).
> GEOM_MIRROR: Force device gm0 start due to timeout.
> GEOM_MIRROR: Device mirror/gm0 launched (1/2).
>
> [~] edwin@k7>gmirror status
>       Name    Status  Components
> mirror/gm0  DEGRADED  ad1
>
> Forgetting and re-inserting the ad3 does attach it again and rebuild
> everything, until the next reboot.

I have one dual PIII machine doing the same to me. I've been assuming
my issue is with the ATA controller. But, in case it helps, here is
the interesting information from my box.

FreeBSD netmon.tcworks.net 7.2-STABLE FreeBSD 7.2-STABLE #2: Fri Dec 4 14:52:34 CST 2009 root@netmon.tcworks.net:/usr/obj/usr/src/sys/GENERIC i386

CPU: Intel(R) Pentium(R) III CPU family 1133MHz (1129.76-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x6b1  Stepping = 1
  Features=0x383fbff
real memory = 2147483648 (2048 MB)
Physical memory chunk(s):
0x0000000000001000 - 0x000000000009efff, 647168 bytes (158 pages)
0x0000000000100000 - 0x00000000003fffff, 3145728 bytes (768 pages)
0x0000000001025000 - 0x000000007dbaafff, 2092457984 bytes (510854 pages)
avail memory = 2091831296 (1994 MB)
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 15.1 on pci0
atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xffa0
ata0: on atapci0
atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0x1f0
atapci0: Reserved 0x1 bytes for rid 0x14 type 4 at 0x3f6
ata0: reset tp1 mask=03 ostat0=50 ostat1=50
ata0: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata0: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
ata0: reset tp2 stat0=50 stat1=00 devices=0x9
ioapic0: routing intpin 14 (ISA IRQ 14) to vector 50
ata0: [MPSAFE]
ata0: [ITHREAD]
ata1: on atapci0
atapci0: Reserved 0x8 bytes for rid 0x18 type 4 at 0x170
atapci0: Reserved 0x1 bytes for rid 0x1c type 4 at 0x376
ata1: reset tp1 mask=03 ostat0=50 ostat1=00
ata1: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
ata1: stat1=0x00 err=0x01 lsb=0x00 msb=0x00
ata1: reset tp2 stat0=50 stat1=00 devices=0x1
ioapic0: routing intpin 15 (ISA IRQ 15) to vector 51
ata1: [MPSAFE]
ata1: [ITHREAD]
ata0-slave: pio=PIO4 wdma=WDMA2 udma=UNSUPPORTED cable=40 wire
ata0-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
ad0: setting PIO4 on ROSB4 chip
ad0: setting UDMA33 on ROSB4 chip
ad0: 238475MB at ata0-master UDMA33
ad0: 488397168 sectors [484521C/16H/63S] 16 sectors/interrupt 1 depth queue
ad0: Adaptec check1 failed
ad0: LSI (v3) check1 failed
ad0: LSI (v2) check1 failed
ad0: FreeBSD check1 failed
acd0: setting PIO4 on ROSB4 chip
acd0: CDROM drive at ata0 as slave
acd0: 128KB buffer, PIO4
acd0: Reads: CDR, CDRW, CDDA stream, packet
acd0: Writes:
acd0: Audio: play, 255 volume levels
acd0: Mechanism: ejectable tray, unlocked
acd0: Medium: no/blank disc
ata1-master: pio=PIO4 wdma=WDMA2 udma=UDMA100 cable=80 wire
ad2: setting PIO4 on ROSB4 chip
ad2: setting UDMA33 on ROSB4 chip
ad2: 238475MB at ata1-master UDMA33
ad2: 488397168 sectors [484521C/16H/63S] 16 sectors/interrupt 1 depth queue
ad2: Adaptec check1 failed
ad2: LSI (v3) check1 failed
ad2: LSI (v2) check1 failed
ad2: FreeBSD check1 failed
ATA PseudoRAID loaded
SMP: AP CPU #1 Launched!
cpu1 AP:
     ID: 0x01000000   VER: 0x00040011 LDR: 0x00000000 DFR: 0xffffffff
  lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
  timer: 0x000200ef therm: 0x00000000 err: 0x00010000 pcm: 0x00010400
ioapic0: Assigning ISA IRQ 1 to local APIC 0
ioapic0: Assigning ISA IRQ 3 to local APIC 1
ioapic0: Assigning ISA IRQ 4 to local APIC 0
ioapic0: Assigning ISA IRQ 6 to local APIC 1
ioapic0: Assigning PCI IRQ 10 to local APIC 0
ioapic0: Assigning ISA IRQ 14 to local APIC 1
ioapic0: Assigning ISA IRQ 15 to local APIC 0
ioapic1: Assigning PCI IRQ 20 to local APIC 1
ioapic1: Assigning PCI IRQ 21 to local APIC 0
GEOM: new disk ad0
GEOM: new disk ad2
GEOM_MIRROR: Device mirror/gm0 launched (2/2).
Trying to mount root from ufs:/dev/mirror/gm0s1a
start_init: trying /sbin/init

13:41:25 Sun Jun 06 $ bzcat messages.2.bz2 | egrep -i "geom|mirror|ad[[:digit:]]"
Dec 11 02:01:34 netmon kernel: Preloaded elf module "/boot/kernel/geom_mirror.ko" at 0xc0e2e174.
Dec 11 02:01:34 netmon kernel: pnpbios: handle 10 device ID PNP0a03 (030ad041)
Dec 11 02:01:34 netmon kernel: ad0: setting PIO4 on ROSB4 chip
Dec 11 02:01:34 netmon kernel: ad0: setting UDMA33 on ROSB4 chip
Dec 11 02:01:34 netmon kernel: ad0: 238475MB at ata0-master UDMA33
Dec 11 02:01:34 netmon kernel: ad0: 488397168 sectors [484521C/16H/63S] 16 sectors/interrupt 1 depth queue
Dec 11 02:01:34 netmon kernel: ad0: Adaptec check1 failed
Dec 11 02:01:34 netmon kernel: ad0: LSI (v3) check1 failed
Dec 11 02:01:34 netmon kernel: ad0: LSI (v2) check1 failed
Dec 11 02:01:34 netmon kernel: ad0: FreeBSD check1 failed
Dec 11 02:01:34 netmon kernel: ad2: setting PIO4 on ROSB4 chip
Dec 11 02:01:34 netmon kernel: ad2: setting UDMA33 on ROSB4 chip
Dec 11 02:01:34 netmon kernel: ad2: 238475MB at ata1-master UDMA33
Dec 11 02:01:34 netmon kernel: ad2: 488397168 sectors [484521C/16H/63S] 16 sectors/interrupt 1 depth queue
Dec 11 02:01:34 netmon kernel: ad2: Adaptec check1 failed
Dec 11 02:01:34 netmon kernel: ad2: LSI (v3) check1 failed
Dec 11 02:01:34 netmon kernel: ad2: LSI (v2) check1 failed
Dec 11 02:01:34 netmon kernel: ad2: FreeBSD check1 failed
Dec 11 02:01:34 netmon kernel: GEOM: new disk ad0
Dec 11 02:01:34 netmon kernel: GEOM: new disk ad2
Dec 11 02:01:34 netmon kernel: GEOM_MIRROR: Device mirror/gm0 launched (2/2).
Dec 11 02:01:34 netmon kernel: Trying to mount root from ufs:/dev/mirror/gm0s1a
Dec 11 02:01:48 netmon kernel: ad2: setting PIO4 on ROSB4 chip
Dec 11 02:01:48 netmon kernel: ad2: setting UDMA33 on ROSB4 chip
Dec 11 02:01:48 netmon kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=232068607
Dec 11 02:02:00 netmon kernel: ad2: setting PIO4 on ROSB4 chip
Dec 11 02:02:00 netmon kernel: ad2: setting UDMA33 on ROSB4 chip
Dec 11 02:02:00 netmon kernel: ad2: TIMEOUT - READ_DMA retrying (1 retry left) LBA=232766751
Dec 11 02:02:10 netmon kernel: ad0: setting PIO4 on ROSB4 chip
Dec 11 02:02:10 netmon kernel: ad0: setting UDMA33 on ROSB4 chip
Dec 11 02:02:10 netmon kernel: ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=232006207
Dec 11 02:02:36 netmon kernel: ad0: setting PIO4 on ROSB4 chip
Dec 11 02:02:36 netmon kernel: ad0: setting UDMA33 on ROSB4 chip
Dec 11 02:02:36 netmon kernel: ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=242232479
Dec 11 02:02:37 netmon kernel: ad2: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=242234911
Dec 11 02:02:37 netmon kernel: ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=242235039
Dec 11 02:02:37 netmon kernel: ad2: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=242234911
Dec 11 02:02:37 netmon kernel: ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=242235039
Dec 11 02:02:37 netmon kernel: ad2: FAILURE - READ_DMA status=51 error=84 LBA=242234911
Dec 11 02:02:37 netmon kernel: ad0: FAILURE - READ_DMA status=51 error=84 LBA=242235039
Dec 11 02:02:37 netmon kernel: GEOM_MIRROR: Request failed (error=5). ad2[READ(offset=124024274432, length=65536)]
Dec 11 02:02:37 netmon kernel: GEOM_MIRROR: Device gm0: provider ad2 disconnected.
Dec 11 02:02:37 netmon kernel: GEOM_MIRROR: Request failed (error=5). ad0[READ(offset=124024339968, length=65536)]
Dec 11 02:02:37 netmon kernel: g_vfs_done():mirror/gm0s1e[READ(offset=112213082112, length=131072)]error = 5
Dec 11 02:02:47 netmon kernel: ad0: setting PIO4 on ROSB4 chip
Dec 11 02:02:47 netmon kernel: ad0: setting UDMA33 on ROSB4 chip
Dec 11 02:02:47 netmon kernel: ad0: TIMEOUT - READ_DMA retrying (1 retry left) LBA=242234911
Dec 11 02:02:47 netmon kernel: ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=242235039
Dec 11 02:02:47 netmon kernel: ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=242235039
Dec 11 02:02:47 netmon kernel: ad0: FAILURE - READ_DMA status=51 error=84 LBA=242235039
Dec 11 02:02:47 netmon kernel: g_vfs_done():mirror/gm0s1e[READ(offset=112213082112, length=131072)]error = 5
Dec 11 02:02:50 netmon kernel: ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=232478271
Dec 11 02:02:50 netmon kernel: ad0: WARNING - READ_DMA UDMA ICRC error (retrying request) LBA=232478271
Dec 11 02:02:50 netmon kernel: ad0: FAILURE - READ_DMA status=51 error=84 LBA=232478271
Dec 11 02:02:50 netmon kernel: g_vfs_done():mirror/gm0s1e[READ(offset=107217682432, length=131072)]error = 5

I have just left it broken since this boot. I was annoyed by this issue
and a few others which all happened to bite me on Dec 11th, while I
was trying to get everything wrapped up for a vacation.

13:44:39 Sun Jun 06 $ geom mirror list
Geom name: gm0
State: DEGRADED
Components: 2
Balance: split
Slice: 4096
Flags: NONE
GenID: 2
SyncID: 1
ID: 3839024964
Providers:
1. Name: mirror/gm0
   Mediasize: 250059349504 (233G)
   Sectorsize: 512
   Mode: r8w4e5
Consumers:
1. Name: ad0
   Mediasize: 250059350016 (233G)
   Sectorsize: 512
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: DIRTY, BROKEN
   GenID: 2
   SyncID: 1
   ID: 803371877

I guess I should rebuild it now that you've reminded me. :-)

-- 
Scott Lambert    KC5MLE    Unix SysAdmin
lambert@lambertfam.org