From owner-freebsd-stable@FreeBSD.ORG Sun Feb 21 03:29:46 2010
Date: Sun, 21 Feb 2010 04:12:39 +0100
From: "C. P. Ghost"
To: Jeremy Chadwick
Cc: freebsd-stable@freebsd.org
Subject: Re: panic - sleeping thread on FreeBSD 8.0-stable / amd64
In-Reply-To: <20100220233546.GA36973@icarus.home.lan>

On Sun, Feb 21, 2010 at 12:35 AM, Jeremy Chadwick wrote:
> We can safely rule out the Silicon Image controller (otherwise "ataX"
> wouldn't be involved), which leaves the AMD SB700 SATA controller and
> the AMD SB700 PATA controller.
>
> What exact disks (e.g. adX) are attached to ata5 and ata6? You haven't
> provided dmesg output in any of your posts, and atacontrol/pciconf is
> not sufficient (I should really improve atacontrol by printing this
> information. I'll work on that in a few minutes).
>
> Some Linux users have reported AHCI-related issues with the SB600
> southbridge, but the core of the problem turned out to be MSI on certain
> AMD northbridges (specifically RS480, RS400, and RS200). By disabling
> MSI entirely they were able to achieve stability. The FreeBSD
> equivalent would be to set the following in loader.conf and reboot:
>
> hw.pci.enable_msix="0"
> hw.pci.enable_msi="0"
>
> The Linux quirk fix for this:
>
> http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob_plain;f=queue-2.6.21/pci-quirks-disable-msi-on-rs400-200-and-rs480.patch;hb=05ab505f2909acf3a614d3e6a32271c4c1f8a69d
>
> Your board has an AMD 740G northbridge, but it might be worth trying the
> MSI disable trick anyway. If it doesn't fix the problem then definitely
> re-enable MSI. Isn't hardware fun? ;-)

Just one more data point.
I have a machine with similar hardware, an MSI K9A2GM-FIH motherboard:

http://eu.msi.com/index.php?func=proddesc&maincat_no=1&prod_no=1436

with an AMD 780G northbridge and an AMD SB700 SATA controller, and I
experienced freezes after switching to AHCI.

Those freezes happened e.g. after some sustained random disk activity,
followed by starting 'dvdisaster'. Then the HDD LED started to blink
slooooowwwwly: every 10 seconds on, then off, and on again. No more
disk activity on the ahci0 controller was possible. The system remained
responsive as long as it didn't involve disk activity (pings, mouse,
keyboard etc. still worked, but starting new processes did not).

I'm pretty sure it started happening (very sporadically) only after I
switched to an AHCI setup. It didn't freeze before under the same load
pattern.

I can't test disabling MSI right now on that box, but will try it on a
similar test machine in a few days (where I hope to reproduce this).

Thanks for the hint!

ahci0: port 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f mem 0xfe7ff800-0xfe7ffbff irq 22 at device 17.0 on pci0
ahci0: [ITHREAD]
ahci0: AHCI v1.10 with 6 3Gbps ports, Port Multiplier supported
ahcich0: at channel 0 on ahci0
ahcich0: [ITHREAD]
ahcich1: at channel 1 on ahci0
ahcich1: [ITHREAD]
ahcich2: at channel 2 on ahci0
ahcich2: [ITHREAD]
ahcich3: at channel 3 on ahci0
ahcich3: [ITHREAD]
ahcich4: at channel 4 on ahci0
ahcich4: [ITHREAD]
ahcich5: at channel 5 on ahci0
ahcich5: [ITHREAD]
ACPI Warning: \\_SB_.PCI0.SBRG.FDC_._FDE: Return type mismatch - found Package, expected Buffer 20090521 nspredef-1051
(aprobe0:ahcich0:0:0:0): SIGNATURE: 0000
ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
ada0: ATA/ATAPI-7 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO size 8192bytes)
ada0: Command Queueing enabled
ada0: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)

-cpghost.

--
Cordula's Web. http://www.cordula.ws/