Message-ID: <1095131646.414661fef24e5@webmail.gatenby.org>
Date: Mon, 13 Sep 2004 23:14:06 -0400
From: Eric Gatenby
To: freebsd-stable@freebsd.org, freebsd-current@freebsd.org
Subject: ATA/SATA lockups

Hi,

I've been having a strange problem with various builds of FreeBSD, from
5.2-RELEASE (at least) up to 5.3-BETA4. Random hard lockups occur when
writing to two separate SATA drives. Sometimes the lockups occur under
high I/O, but not always.

Due to the random nature of the lockups, I don't have much hard evidence
or information to provide. How can I go about gathering more information?
I've tried enabling WITNESS and other kernel debugging options, but no
extra debugging data was produced.

The drives aren't configured as RAID; they are accessed separately and
not configured in any special way. They are two 160 GB Seagate SATA
(ST3160023AS) drives that are being accessed via their ar* devices. I've
also tried accessing them directly via their ad* devices, but the lockups
still occurred. smartmontools reports both drives as good on the long and
short tests.

Any suggestions would be appreciated.

Thanks!

-------------------------------------------------------------------
dmesg output:

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.3-BETA4 #0: Sat Sep 11 13:12:26 EDT 2004
    root@triassic.gatenby.org:/build/obj/build/src/sys/GENERIC
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.53GHz (2539.10-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf27  Stepping = 7
  Features=0xbfebfbff
real memory = 1073659904 (1023 MB)
avail memory = 1041092608 (992 MB)
ACPI APIC Table:
ioapic0 irqs 0-23 on motherboard
npx0: [FAST]
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
acpi0: Overriding SCI Interrupt from IRQ 9 to IRQ 22
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
cpu0: on acpi0
acpi_button0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
agp0: mem 0xf8000000-0xfbffffff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at device 0.0 (no driver attached)
pcib2: at device 30.0 on pci0
pci2: on pcib2
atapci0: port 0xa000-0xa07f,0xa400-0xa40f,0xa800-0xa83f mem 0xdd800000-0xdd81ffff,0xde000000-0xde000fff irq 23 at device 4.0 on pci2
atapci0: failed: rid 0x20 is memory, requested 4
ata2: channel #0 on atapci0
ata3: channel #1 on atapci0
ata4: channel #2 on atapci0
bge0: mem 0xdd000000-0xdd00ffff irq 20 at device 5.0 on pci2
miibus0: on bge0
brgphy0: on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:e0:18:fe:24:8b
isab0: at device 31.0 on pci0
isa0: on isab0
atapci1: port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 irq 18 at device 31.1 on pci0
ata0: channel #0 on atapci1
ata1: channel #1 on atapci1
fdc0: port 0x3f7,0x3f2-0x3f5 irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
orm0: at iomem 0xc0000-0xcffff on isa0
pmtimer0 on isa0
ppc0: parallel port not found.
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 8250 or not responding
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 2539104508 Hz quality 800
Timecounters tick every 10.000 msec
acpi_cpu: throttling enabled, 8 steps (100% to 12.5%), currently 100.0%
ad0: 117246MB [238216/16/63] at ata0-master UDMA100
ad4: 152627MB [310101/16/63] at ata2-master SATA150
ad6: 152627MB [310101/16/63] at ata3-master SATA150
ar0: 152627MB [19457/255/63] status: READY subdisks:
 disk0 READY on ad4 at ata2-master
ar1: 152627MB [19457/255/63] status: READY subdisks:
 disk0 READY on ad6 at ata3-master
Mounting root from ufs:/dev/ad0s1a
Accounting enabled

-------------------------------------------------------------------
fdisk output for the first drive.
The second drive is exactly the same:

******* Working on device /dev/ar0 *******
parameters extracted from in-core disklabel are:
cylinders=19457 heads=255 sectors/track=63 (16065 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=19457 heads=255 sectors/track=63 (16065 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 63, size 312576642 (152625 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 1023/ head 254/ sector 63
The data for partition 2 is:
The data for partition 3 is:
The data for partition 4 is:

-------------------------------------------------------------------
smartctl output:

=== START OF INFORMATION SECTION ===
Device Model:     ST3160023AS
Serial Number:    3JS325HD
Firmware Version: 3.18
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   6
ATA Standard is:  ATA/ATAPI-6 T13 1410D revision 2
Local Time is:    Mon Sep 13 23:07:54 2004 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

--
Eric Gatenby - eric@gatenby.org - AIM: egatenby
http://eric.gatenby.org/

Doubt of the reality of love ends by making us doubt everything.
  -- Henri-Frédéric Amiel