From owner-freebsd-stable@FreeBSD.ORG Wed Mar 22 13:06:09 2006
Date: Wed, 22 Mar 2006 14:06:04 +0100
From: Martin Horneffer
To: freebsd-stable@freebsd.org
Message-ID: <20060322130604.GA27234@nic.dtag.de>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="y0ulUmNC+osPPQO6"
Content-Disposition: inline
User-Agent: Mutt/1.4i
Subject: "TIMEOUT - WRITE_DMA" with SiI 3512 SATA on IBM eServer 326
List-Id: Production branch of FreeBSD source code

--y0ulUmNC+osPPQO6
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi,

I have a problem, probably with the SiI 3512 SATA150 controller in a
dual-Opteron IBM eServer 326: every once in a while the kernel issues
a message like:

ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=150190687

The system waits a few seconds and then continues to work normally.
This typically occurs several times a day, most likely depending on
the load on the (SATA-connected) hard drive.

We have two machines with the same hardware configuration, each with
two hard drives of identical type. The problem is the same with all
four drives on both machines, so I assume this is a driver problem
rather than a bad SATA cable.

We are currently running one of the machines with FreeBSD 5-stable
(RELENG_5) and the other with Linux. Linux has no problem with the
hardware; FreeBSD does.

We tried 5.4-Release and 6.0-Release, each on both i386 and amd64.
Only 5.4-Release on amd64 installed successfully, although with some
warnings. The other versions failed to install at all.

After the successful installation we noticed two problems:

- After a couple of hours of uptime, top stopped reporting CPU
  utilization numbers (all 0). This went away after changing the
  timecounter hardware from ACPI-fast to i8254
  (kern.timecounter.hardware=i8254 in /etc/sysctl.conf).

- The "TIMEOUT - WRITE_DMA" messages occur from time to time, each
  time stalling the system for a few seconds (probably all processes
  trying to access the hard drive).

So far I have not managed to solve the latter. I upgraded to 5-stable
(RELENG_5) with cvsup (most recently today), but the problem is still
the same. Apart from the occasional hiccups, the machine runs fine.

The SATA controller reports itself as "SiI 3512A SATALink BIOS Version
4.3.47" during BIOS startup. I'll attach the latest dmesg output.

Any suggestions?

Best regards,
Martin

-- 
Dr. Martin Horneffer -- maho@nic.dtag.de
Deutsche Telekom AG T-Com Technology Engineering Internet Backbone Architecture

--y0ulUmNC+osPPQO6
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="dmesg-2006-03-22.txt"

Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California.
All rights reserved.
FreeBSD 5.5-PRERELEASE #1: Mon Mar 20 16:24:38 CET 2006
root@xxxx.NIC.DTAG.DE:/usr/obj/usr/src/sys/XXXX
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Opteron(tm) Processor 248 (2193.17-MHz K8-class CPU)
Origin = "AuthenticAMD" Id = 0xf5a Stepping = 10
Features=0x78bfbff
AMD Features=0xe0500800
real memory = 2146893824 (2047 MB)
avail memory = 2063441920 (1967 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID: 0
cpu1 (AP): APIC ID: 1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-27 on motherboard
ioapic2 irqs 28-31 on motherboard
acpi0: on motherboard
acpi0: Power Button (fixed)
unknown: I/O range not supported
unknown: I/O range not supported
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
cpu0: on acpi0
powernow0: on cpu0
device_attach: powernow0 attach returned 6
cpu1: on acpi0
powernow1: on cpu1
device_attach: powernow1 attach returned 6
acpi_button0: on acpi0
pcib0: port 0x8080-0x80ff,0x8000-0x807f,0xcf8-0xcff iomem 0xd8000-0xdbfff on acpi0
pci0: on pcib0
pcib1: at device 6.0 on pci0
pci1: on pcib1
ohci0: mem 0xfc100000-0xfc100fff irq 19 at device 0.0 on pci1
usb0: OHCI version 1.0, legacy support
usb0: on ohci0
usb0: USB revision 1.0
uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 3 ports with 3 removable, self powered
ohci1: mem 0xfc101000-0xfc101fff irq 19 at device 0.1 on pci1
usb1: OHCI version 1.0, legacy support
usb1: on ohci1
usb1: USB revision 1.0
uhub1: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
pci1: at device 5.0 (no driver attached)
atapci0: port 0x2400-0x240f,0x2410-0x2413,0x2418-0x241f,0x2414-0x2417,0x2420-0x2427 mem 0xfc103000-0xfc1031ff irq 17 at device 6.0 on pci1
ata2: channel #0 on atapci0
ata3: channel #1 on atapci0
isab0: at device 7.0 on pci0
isa0: on isab0
atapci1: port 0x1020-0x102f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 7.1 on pci0
ata0: channel #0 on atapci1
ata1: channel #1 on atapci1
pci0: at device 7.2 (no driver attached)
pci0: at device 7.3 (no driver attached)
pcib2: at device 10.0 on pci0
pci2: on pcib2
bge0: mem 0xfe000000-0xfe00ffff,0xfe010000-0xfe01ffff irq 24 at device 1.0 on pci2
miibus0: on bge0
brgphy0: on miibus0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:11:25:1e:23:a4
bge1: mem 0xfe020000-0xfe02ffff,0xfe030000-0xfe03ffff irq 25 at device 1.1 on pci2
miibus1: on bge1
brgphy1: on miibus1
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:11:25:1e:23:a5
pci0: at device 10.1 (no driver attached)
pcib3: at device 11.0 on pci0
pci3: on pcib3
pci0: at device 11.1 (no driver attached)
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
orm0: at iomem 0xcb000-0xcf7ff,0xc9800-0xcafff,0xc8000-0xc97ff,0xc0000-0xc7fff on isa0
atkbdc0: at port 0x64,0x60 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 1.000 msec
acd0: CDROM at ata1-master PIO4
ad4: 76324MB [155072/16/63] at ata2-master SATA150
ad6: 76324MB [155072/16/63] at ata3-master SATA150
SMP: AP CPU #1 Launched!
Mounting root from ufs:/dev/ad4s1a
ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=8319

--y0ulUmNC+osPPQO6--
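
Since the report says the timeouts happen "several times a day" and may depend on disk load, it could help to count them per device from saved kernel output. The following is only a minimal sketch, not part of the original post: the function names parse_timeouts and count_by_device are hypothetical, and the pattern is derived solely from the two messages quoted above (the singular "retry" form is an assumption).

```python
import re
from collections import Counter

# Pattern for ATA timeout lines of the form quoted in the post, e.g.
#   ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=150190687
# Accepting "retry" as well as "retries" is an assumption about the
# singular form; only the plural appears in the post.
TIMEOUT_RE = re.compile(
    r"\b(?P<dev>ad\d+): TIMEOUT - (?P<cmd>[A-Z0-9_]+) retrying "
    r"\((?P<left>\d+) retr(?:y|ies) left\) LBA=(?P<lba>\d+)"
)


def parse_timeouts(lines):
    """Return one dict per timeout message found in the given lines."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            events.append({
                "dev": match.group("dev"),
                "cmd": match.group("cmd"),
                "retries_left": int(match.group("left")),
                "lba": int(match.group("lba")),
            })
    return events


def count_by_device(events):
    """Count timeout events per device name (e.g. 'ad4')."""
    return Counter(event["dev"] for event in events)


if __name__ == "__main__":
    # Sample input: lines taken from the message and its dmesg attachment.
    sample = [
        "ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=150190687",
        "ad6: 76324MB [155072/16/63] at ata3-master SATA150",
        "ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=8319",
    ]
    for dev, count in sorted(count_by_device(parse_timeouts(sample)).items()):
        print(f"{dev}: {count} timeout(s)")
```

Fed the saved dmesg or syslog text line by line, this would show whether the timeouts cluster on one drive or are spread across both, which bears on the driver-versus-cable question raised in the report.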