From: Miroslav Lachman <000.fbsd@quip.cz>
Date: Fri, 14 Jul 2006 21:02:17 +0200
To: freebsd-stable@freebsd.org
Subject: Re: ATA problems again ...
Message-ID: <44B7EA39.4060509@quip.cz>
In-Reply-To: <20060713132357.Y61840@fledge.watson.org>

Robert Watson wrote:

> I don't have a whole lot to add to this thread, but have changed the
> subject to make sure that the right people are reading this. This is
> likely either a hardware problem (motherboard/cable/drive) or driver
> problem. GEOM and the mirror driver seems to be behaving as desired (it
> detaches a drive reported by the driver as being bad). Could you post
> the dmesg -v output for the probing of the ata controller and driver?

Same problem here: either the first (ad4) or the second (ad8) disk disappears from the system about once a day, independent of disk or CPU load. Sometimes it happens without any load. Today I was stress testing the disks by copying /usr/ports to another slice in a loop, and after 3 hours I got:

Jul 14 19:05:45 track kernel: ad8: FAILURE - device detached
Jul 14 19:05:45 track kernel: subdisk8: detached
Jul 14 19:05:45 track kernel: ad8: detached
Jul 14 19:05:45 track kernel: GEOM_MIRROR: Device gm0: provider ad8 disconnected.
Jul 14 19:05:45 track kernel: g_vfs_done():mirror/gm0s1h[READ(offset=6345932800, length=65536)]error = 6
Jul 14 19:05:45 track kernel: vnode_pager_getpages: I/O read error
Jul 14 19:05:45 track kernel: vm_fault: pager read error, pid 5108 (cp)

After a reboot (with the reboot command), the system boots with both disks attached and starts autosynchronization.

I do not know whether this is a hardware or a software error. I have two identical machines with almost the same software setup and exactly the same hardware setup, but these errors occur on only one of them.

dmesg.boot before the ad8 failure (rebuilding ad4 from the previous failure):

Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE #0: Sun May 7 04:42:56 UTC 2006
root@opus.cse.buffalo.edu:/usr/obj/usr/src/sys/SMP
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (3000.12-MHz 686-class CPU)
Origin = "GenuineIntel" Id = 0xf43 Stepping = 3
Features=0xbfebfbff
Features2=0x649d>
AMD Features=0x20100000
Logical CPUs per core: 2
real memory = 1073610752 (1023 MB)
avail memory = 1041489920 (993 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
ioapic0: Changing APIC ID to 2
ioapic1: Changing APIC ID to 3
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
kbd1 at kbdmux0
acpi0: on motherboard
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
acpi0: Power Button (fixed)
acpi_bus_number: can't get _ADR
acpi_bus_number: can't get _ADR
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: on acpi0
acpi_throttle0: on cpu0
cpu1: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pcib1: irq 16 at device 28.0 on pci0
pci1: on pcib1
pcib2: at device 0.0 on pci1
pci2: on pcib2
pcib3: irq 16 at device 28.4 on pci0
pci3: on pcib3
bge0: mem 0xfc8f0000-0xfc8fffff irq 16 at device 0.0 on pci3
miibus0: on bge0
brgphy0: on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:15:f2:ec:43:69
pcib4: irq 17 at device 28.5 on pci0
pci4: on pcib4
bge1: mem 0xfc9f0000-0xfc9fffff irq 17 at device 0.0 on pci4
miibus1: on bge1
brgphy1: on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:15:f2:ec:43:6a
uhci0: port 0xec00-0xec1f irq 23 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: port 0xe880-0xe89f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
ehci0: mem 0xfebffc00-0xfebfffff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb2: EHCI version 1.0
usb2: companion controllers, 2 ports each: usb0 usb1
usb2: on ehci0
usb2: USB revision 2.0
uhub2: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub2: 4 ports with 4 removable, self powered
pcib5: at device 30.0 on pci0
pci5: on pcib5
pci5: at device 2.0 (no driver attached)
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
ata0: on atapci0
ata1: on atapci0
atapci1: port 0xe800-0xe807,0xe480-0xe483,0xe400-0xe407,0xe080-0xe083,0xe000-0xe00f mem 0xfebff800-0xfebffbff irq 19 at device 31.2 on pci0
ata2: on atapci1
ata3: on atapci1
ata4: on atapci1
ata5: on atapci1
pci0: at device 31.3 (no driver attached)
acpi_button0: on acpi0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
ppc0: port 0x378-0x37f irq 7 on acpi0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
pmtimer0 on isa0
orm0: at iomem 0xc0000-0xc7fff on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: DVDROM at ata0-slave UDMA100
ad4: 238475MB at ata2-master SATA150
GEOM_MIRROR: Device gm0 created (id=565164480).
GEOM_MIRROR: Device gm0: provider ad4 detected.
ad8: 238475MB at ata4-master SATA150
SMP: AP CPU #1 Launched!
GEOM_MIRROR: Device gm0: provider ad8 detected.
GEOM_MIRROR: Device gm0: provider ad8 activated.
GEOM_MIRROR: Device gm0: provider mirror/gm0 launched.
GEOM_MIRROR: Device gm0: rebuilding provider ad4.
Trying to mount root from ufs:/dev/mirror/gm0s1a
bge0: link state changed to UP

uname:
FreeBSD 6.1-RELEASE #0: Sun May 7 04:42:56 UTC 2006 root@opus.cse.buffalo.edu:/usr/obj/usr/src/sys/SMP i386

Miroslav Lachman