From: Michael Boers
To: freebsd-geom@freebsd.org
Date: Mon, 14 Mar 2005 13:46:15 -0500
Subject: gmirrored boot drives locks up during buildworld
List-Id: GEOM-specific discussions and implementations

I recently installed FreeBSD 5.3 on a machine intended to be my primary MySQL server. The machine failed after about 3 weeks of heavy use. It did not panic; it just froze, and some random characters appeared on the console. A reboot restored the system for another few weeks. After the third failure I took it out of production.

The machine has an Intel Pentium 4 EE HT with a pair of 80 gigabyte IDE gmirrored boot drives and a pair of 250 gigabyte IDE gmirrored data drives.

With the machine out of production, I used

    while (true) do make clean; make buildworld; done

to exercise the machine until it failed, usually within three days.
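Since a freeze leaves nothing behind to diagnose, one way to at least narrow down when each freeze happens is to have the loop write a timestamp before every pass. What follows is only a sketch of that idea, not the loop actually used above: the function name stress_loop, its arguments, and the log path in the example are hypothetical, and it assumes a POSIX /bin/sh.

```shell
#!/bin/sh
# Sketch only: stress_loop, its arguments, and the log path are hypothetical.

# stress_loop LOGFILE MAX_ITERATIONS COMMAND [ARGS...]
# Runs COMMAND repeatedly, appending a timestamped line to LOGFILE before
# each run. A MAX_ITERATIONS of 0 means "loop until COMMAND fails".
stress_loop() {
    log=$1
    max=$2
    shift 2
    i=0
    while [ "$max" -eq 0 ] || [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        echo "$(date '+%Y-%m-%d %H:%M:%S') starting iteration $i" >> "$log"
        sync    # ask the kernel to flush the log line before the heavy work starts
        if ! "$@"; then
            echo "$(date '+%Y-%m-%d %H:%M:%S') iteration $i failed" >> "$log"
            return 1
        fi
    done
    return 0
}

# Example (assumes the source tree lives in /usr/src):
#   cd /usr/src && stress_loop /var/tmp/buildworld-loop.log 0 \
#       sh -c 'make clean && make buildworld'
```

After a reboot, the last line in the log gives the start time of the pass during which the machine froze. If the log lives on the mirror being tested, the final line may not survive the freeze, so a log file on a separate, non-mirrored disk is probably the safer choice.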
I swapped video cards, memory, and hard drives, and played with BIOS settings, to no avail. Eventually I determined that when I ran without gmirror, the machine would build indefinitely. Finally, I tried the buildworld test on a completely different machine (AMD vs. Intel, SCSI vs. IDE disks) and it failed in less than 3 hours.

Because the system freezes rather than panics, I have no diagnostic information to provide. If this is a possible gmirror bug, please let me know if there is any other information I can provide. I am very interested in using gmirror, but I want to make sure it is safe. Please feel free to call me at the number below if necessary.

-- 
Michael Boers
Datacomp
877-406-0231

Details on First Machine

uname output:

FreeBSD saturn.datacomp-intranet.com 5.3-RELEASE FreeBSD 5.3-RELEASE #0: Fri Mar 11 13:37:46 EST 2005 msb@saturn.datacomp-intranet.com:/usr/src/sys/i386/compile/SATURN i386

dmesg boot info:

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 5.3-RELEASE #0: Fri Mar 11 13:37:46 EST 2005
    msb@saturn.datacomp-intranet.com:/usr/src/sys/i386/compile/SATURN
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.40GHz (3400.13-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf25  Stepping = 5
  Features=0xbfebfbff
  Hyperthreading: 2 logical CPUs
real memory  = 1073676288 (1023 MB)
avail memory = 1045311488 (996 MB)
ioapic0 irqs 0-23 on motherboard
npx0: [FAST]
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: on acpi0
acpi_button0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: mem 0xf8000000-0xfbffffff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at device 0.0 (no driver attached)
uhci0: port 0xcc00-0xcc1f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xd000-0xd01f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: port 0xd400-0xd41f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: port 0xd800-0xd81f irq 16 at device 29.3 on pci0
uhci3: [GIANT-LOCKED]
usb3: on uhci3
usb3: USB revision 1.0
uhub3: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
pci0: at device 29.7 (no driver attached)
pcib2: at device 30.0 on pci0
pci2: on pcib2
em0: port 0xbc00-0xbc3f mem 0xfeaa0000-0xfeabffff,0xfeac0000-0xfeadffff irq 19 at device 3.0 on pci2
em0: Ethernet address: 00:0e:0c:33:2c:96
em0: Speed:N/A Duplex:N/A
re0: port 0xb800-0xb8ff mem 0xfeafff00-0xfeafffff irq 20 at device 6.0 on pci2
miibus0: on re0
rgephy0: on miibus0
rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
re0: Ethernet address: 00:11:09:81:40:a9
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0xfc00-0xfc0f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
atapci1: port 0xdc00-0xdc0f,0xe000-0xe003,0xe400-0xe407,0xe800-0xe803,0xec00-0xec07 irq 18 at device 31.2 on pci0
ata2: channel #0 on atapci1
ata3: channel #1 on atapci1
pci0: at device 31.3 (no driver attached)
pci0: at device 31.5 (no driver attached)
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0: port 0x778-0x77b,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: on ppc0
lpt0: on ppbus0
lpt0: Interrupt-driven port
pmtimer0 on isa0
orm0: at iomem 0xe0000-0xe0fff,0xcf800-0xd07ff,0xc0000-0xcf7ff on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 3400132108 Hz quality 800
Timecounters tick every 10.000 msec
ipfw2 initialized, divert disabled, rule-based forwarding disabled, default to deny, logging limited to 1 packets/entry by default
acpi_cpu: throttling enabled, 8 steps (100% to 12.5%), currently 100.0%
ad0: 76319MB [155061/16/63] at ata0-master UDMA100
ad1: 76319MB [155061/16/63] at ata0-slave UDMA100
acd0: CDROM at ata1-master PIO4
ad4: 238475MB [484521/16/63] at ata2-master SATA150
GEOM_MIRROR: Device boot created (id=1429322453).
GEOM_MIRROR: Device boot: provider ad0 detected.
ad6: 238475MB [484521/16/63] at ata3-master SATA150
GEOM_MIRROR: Device boot: provider ad1 detected.
GEOM_MIRROR: Device boot: provider ad1 activated.
GEOM_MIRROR: Device boot: provider mirror/boot launched.
GEOM_MIRROR: Device boot: rebuilding provider ad0.
GEOM_MIRROR: Device databases created (id=59908152).
GEOM_MIRROR: Device databases: provider ad4 detected.
GEOM_MIRROR: Device databases: provider ad6 detected.
GEOM_MIRROR: Device databases: provider ad6 activated.
GEOM_MIRROR: Device databases: provider mirror/databases launched.
GEOM_MIRROR: Device databases: rebuilding provider ad4.
Mounting root from ufs:/dev/mirror/boota
em0: Link is up 100 Mbps Full Duplex