From: "Wil Hatfield - HyperConX"
Delivered-To: freebsd-questions@freebsd.org
Date: Fri, 31 Mar 2006 14:08:16 -0800
Subject: ATA Drive Issues

What is the problem with 5.4 and ATA drives?

I am running the latest release, FreeBSD 5.4-RELEASE-p11. I have two basic ATA drives, no RAID and no SCSI anything. Every now and then, under a bit of load, the hard drive freezes with either a kernel panic or a Write_DMA error. I have to reboot the machine and run fsck -y to recover. Sometimes I have to run it twice.

Following several similar posts, I have the following enabled in my loader.conf file:

hw.ata.ata_dma=0
hw.ata.atapi_dma=0

However, this hasn't fixed the problem.

Given the number of reports similar to mine, I don't think this is strictly a DMA or drive issue. The DMA error is just the symptom of a deeper underlying problem, maybe something in the kernel or drivers.

The same issue occurs on 3 brand-new Supermicro machines, all running nearly identical Western Digital drives: 4 drives are 200GB WDs and 1 is a 160GB WD. All have brand-new cables. Since this is all brand-new equipment, please don't pass this off as a bad cable. It isn't.

As for the drives, I have smarttools running on these systems now; there are no bad sectors and the drive health is all clean. Absolutely no issues are reported by smarttools, and none of the attributes have changed at all.

Here is some more info:

--dmesg.today snippet--
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 5.4-RELEASE-p11 #0: Tue Mar 28 17:18:36 PST 2006
    wilh@hera.xxxxxxxxx.net:/usr/obj/usr/src/sys/CUSTOM-KERNEL
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.20GHz (3200.13-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf41 Stepping = 1
  Features=0xbfebfbff
  Hyperthreading: 2 logical CPUs
real memory = 2146893824 (2047 MB)
avail memory = 2099638272 (2002 MB)
ACPI APIC Table:
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 1
 cpu2 (AP): APIC ID: 6
 cpu3 (AP): APIC ID: 7
ioapic0 irqs 0-23 on motherboard
ioapic1 irqs 24-47 on motherboard
ioapic2 irqs 48-71 on motherboard
ioapic3 irqs 72-95 on motherboard
ioapic4 irqs 96-119 on motherboard
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: on acpi0
cpu1: on acpi0
cpu2: on acpi0
cpu3: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci0: at device 0.1 (no driver attached)
pci0: at device 1.0 (no driver attached)
pcib1: irq 16 at device 2.0 on pci0
pci1: on pcib1
pcib2: irq 16 at device 3.0 on pci0
pci2: on pcib2
pcib3: at device 0.0 on pci2
pci3: on pcib3
pci2: at device 0.1 (no driver attached)
pcib4: at device 0.2 on pci2
pci4: on pcib4
em0: port 0x2000-0x203f mem 0xdd200000-0xdd21ffff irq 54 at device 2.0 on pci4
em0: Ethernet address: 00:30:48:2c:c3:80
em0: Speed:N/A Duplex:N/A
em1: port 0x2040-0x207f mem 0xdd220000-0xdd23ffff irq 55 at device 2.1 on pci4
em1: Ethernet address: 00:30:48:2c:c3:81
em1: Speed:N/A Duplex:N/A
pci2: at device 0.3 (no driver attached)
pcib5: irq 16 at device 4.0 on pci0
pci5: on pcib5
pcib6: at device 0.0 on pci5
pci6: on pcib6
pci5: at device 0.1 (no driver attached)
pcib7: at device 0.2 on pci5
pci7: on pcib7
pci5: at device 0.3 (no driver attached)
pcib8: irq 16 at device 6.0 on pci0
pci8: on pcib8
pci0: at device 29.0 (no driver attached)
pci0: at device 29.1 (no driver attached)
pci0: at device 29.2 (no driver attached)
pci0: at device 29.3 (no driver attached)
pci0: at device 29.7 (no driver attached)
pcib9: at device 30.0 on pci0
pci9: on pcib9
pci9: at device 1.0 (no driver attached)
isab0: at device 31.0 on pci0
isa0: on isab0
atapci0: port 0x14a0-0x14af,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pci0: at device 31.3 (no driver attached)
acpi_button0: on acpi0
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
pmtimer0 on isa0
orm0: at iomem 0xc8000-0xc8fff,0xc0000-0xc7fff on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounters tick every 10.000 msec
IP Filter: v3.4.35 initialized. Default = block all, Logging = enabled
ipfw2 initialized, divert disabled, rule-based forwarding disabled, default to deny, logging unlimited
ad0: 190782MB [387621/16/63] at ata0-master PIO4
ad1: 190782MB [387621/16/63] at ata0-slave PIO4
acd0: CDROM at ata1-master PIO4
SMP: AP CPU #2 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #3 Launched!
Mounting root from ufs:/dev/ad0s1a
em0: Link is up 100 Mbps Full Duplex

Let me know if anyone wants more info. Any help or insight that anyone can provide would be great. These machines went into production just recently, and these issues didn't appear until they were put under some load. So basically I am now screwed. HELP!

Cheers,
--
Wil Hatfield