From owner-freebsd-stable@FreeBSD.ORG Fri Feb 26 11:55:38 2010
Date: Fri, 26 Feb 2010 03:55:36 -0800
From: Jeremy Chadwick
To: Torfinn Ingolfsen
Cc: freebsd-stable@freebsd.org
Message-ID: <20100226115536.GA17798@icarus.home.lan>
In-Reply-To: <20100226110337.70d1a758.torfinn.ingolfsen@broadpark.no>
Subject: Re: panic - sleeping thread on FreeBSD 8.0-stable / amd64

On Fri, Feb 26, 2010 at 11:03:37AM +0100, Torfinn Ingolfsen wrote:
> > What exact disks (e.g. adX) are attached to ata5 and ata6?
>
> root@kg-f2# dmesg | grep ata5
> ata5: on atapci0
> ata5: [ITHREAD]
> ad10: 953869MB at ata5-master UDMA100 SATA 3Gb/s
> root@kg-f2# dmesg | grep ata6
> ata6: on atapci0
> ata6: [ITHREAD]
> ad12: 953869MB at ata6-master UDMA100 SATA 3Gb/s
>
> ...snip...
>
> No, I didn't. I did state that full dmesg's and more info was available on the freebsd web page[1] for the machine
> in one of my first posts.

Okay, so the breakdown for those following is:

http://sites.google.com/site/tingox/f2-dmesg-8.0-stable-20100131.txt?attredirects=0

atapci0: port 0xff00-0xff07,0xfe00-0xfe03,0xfd00-0xfd07,0xfc00-0xfc03,0xfb00-0xfb0f mem 0xfe02f000-0xfe02f3ff irq 22 at device 17.0 on pci0
atapci0: [ITHREAD]
atapci0: AHCI v1.10 controller with 6 3Gbps ports, PM supported
ata2: on atapci0
ata3: on atapci0
ata4: on atapci0
ata5: on atapci0
ata6: on atapci0
ata7: on atapci0
ad6: 238475MB at ata3-master UDMA100 SATA 3Gb/s
ad8: 953869MB at ata4-master UDMA100 SATA 3Gb/s
ad10: 953869MB at ata5-master UDMA100 SATA 3Gb/s
ad12: 953869MB at ata6-master UDMA100 SATA 3Gb/s
ad14: 953869MB at ata7-master UDMA100 SATA 3Gb/s

But the only ports which are having issues are ata5 and ata6, which host
disks ad10 and ad12 respectively.

SMART stats for ad10 and ad12 look fantastic, aside from slightly long
spin-up times (claiming over 8 seconds), but that wouldn't cause what's
seen here. Both disks have been used for nearly 1700 hours.
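
For anyone who wants to pull the channel-to-disk mapping out of a dmesg
mechanically rather than by eye, here is a rough sketch in Python. It is
not part of the original diagnosis; the function name channel_map and the
regular expression are my own assumptions, and the sample input is just
the adX lines from the excerpt above. It only understands lines of the
form "adN: ... at ataM-master" (or -slave) as printed by the ata(4)
driver on this system.

```python
import re

# Sample input: the adX lines copied from the dmesg excerpt above.
DMESG = """\
ad6: 238475MB at ata3-master UDMA100 SATA 3Gb/s
ad8: 953869MB at ata4-master UDMA100 SATA 3Gb/s
ad10: 953869MB at ata5-master UDMA100 SATA 3Gb/s
ad12: 953869MB at ata6-master UDMA100 SATA 3Gb/s
ad14: 953869MB at ata7-master UDMA100 SATA 3Gb/s
"""

# Assumed line shape: "adN: <anything> at ataM-master" or "...-slave".
DISK_RE = re.compile(r"^(ad\d+):.*\bat (ata\d+)-(master|slave)\b")


def channel_map(dmesg_text):
    """Return {channel: [disk, ...]} for every disk line that matches."""
    mapping = {}
    for line in dmesg_text.splitlines():
        match = DISK_RE.match(line.strip())
        if match:
            disk, channel, _role = match.groups()
            mapping.setdefault(channel, []).append(disk)
    return mapping


if __name__ == "__main__":
    # Print one "channel -> disks" line per ATA channel found.
    for channel, disks in sorted(channel_map(DMESG).items()):
        print(channel, "->", ", ".join(disks))
```

Feeding it the full dmesg text instead of the sample should work the same
way, since lines that don't match the pattern are simply skipped.
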
No SMART error log entries exist on either disk, which means the
timeouts are very likely happening when talking to the controller itself
(and not while the controller is waiting on a request it has submitted
to the disk).

I'm out of ideas aside from the following:

1) Disabling MSI/MSIX, which at this point I'm doubting will fix
   anything (but you never know), since I'd expect it to affect the
   entire controller and not just specific ports on the controller.

2) Replacing the SATA cables used between ata5<-->ad10 and
   ata6<-->ad12.

3) Getting mav@ to talk to AMD to find out if there are any AHCI quirks
   in the IXP700 or IXP800 SATA controllers, as there could be some
   weird driver bug on FreeBSD, or a quirk workaround that's needed.

Mainly for mav@: verbose boot messages for this system are here, in case
any SATA register details are of help:

http://sites.google.com/site/tingox/f2-dmesg-8.0-stable-20100131_verb1.txt?attredirects=0
http://sites.google.com/site/tingox/f2-dmesg-8.0-stable-20100131_verb2.txt?attredirects=0

-- 
| Jeremy Chadwick                                    jdc@parodius.com |
| Parodius Networking                        http://www.parodius.com/ |
| UNIX Systems Administrator                   Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP: 4BD6C0CB |