From: Jung-uk Kim
To: freebsd-stable@FreeBSD.org
Cc: Josh Endries
Date: Mon, 8 Aug 2005 17:28:38 -0400
Subject: Re: twa0 errors and system lockup on amd64

On Monday 08 August 2005 05:03 pm, Josh Endries wrote:
> Hello,
>
> Just a little while ago I got this on a test 5.4-stable dual
> Opteron box I'm setting up (9500S-LP RAID 5 with a hot spare):
>
> Aug 8 15:58:21 kernel: twa0: ERROR: (0x05: 0x210b): Request timed
> out!: request = 0xffffffff80a67700
> Aug 8 15:58:21 kernel: twa0: INFO: (0x16: 0x1108): Resetting
> controller...:
> Aug 8 15:58:21 kernel: twa0: ERROR: (0x15: 0x110b): Can't drain
> AEN queue after reset: error = 60
> Aug 8 15:58:21 kernel: twa0: ERROR: (0x16: 0x1105): Controller
> reset failed: error = 60; attempt 1
>
> It attempted twice and then just sat there after that. I couldn't
> log in at all so I did a hard reset after probably 30+ minutes. I
> didn't find much online other than driver source code or twe(4) man
> pages, which suggests that it's a problem between the driver and
> card. Has anyone else seen this problem? Is it a sign of a flaky
> card or could it be something else? Maybe it's something to do with
> AMD64? This system was supposed to go into production tomorrow. I
> guess it's good that it died today...

I have seen it with 9500S-8, which is the same controller with 8
ports. In fact, I am seeing other problems.

twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: INFO: (0x04: 0x003b): Rebuild paused: unit=0
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
twa0: INFO: (0x04: 0x003b): Rebuild paused: unit=0
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
twa0: INFO: (0x04: 0x0005): Rebuild completed: unit=0
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5

I have rebuilt this array many times but it's happening again and
again. It seems this controller/driver has issues with amd64. FYI, UP
kernel or replacing cables didn't fix the problem.

Good luck,

Jung-uk Kim

> Josh
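
The twa0 messages quoted in this thread all seem to share one shape: device name, severity, a pair of hex codes, a description, and an optional key=value detail. The following is a minimal Python sketch, assuming that shape holds for other messages as well, that tallies events by severity, description, and port so that a repeating fault (such as the port 5 timeouts above) stands out. The pattern is inferred only from the lines quoted here, and the names `LINE_RE`, `parse_line`, `tally`, and `SAMPLE` are illustrative; none of this is part of the driver or of any 3ware tool.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this thread.
# This is an assumption, not a documented message format.
LINE_RE = re.compile(
    r"(?P<dev>twa\d+): (?P<sev>ERROR|INFO): "
    r"\((?P<src>0x[0-9a-fA-F]+): (?P<code>0x[0-9a-fA-F]+)\): "
    r"(?P<msg>[^:]+):\s*(?P<detail>.*)"
)


def parse_line(line):
    """Return a dict describing one twa message, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    event = m.groupdict()
    # Pull out key=value pairs such as "unit=0, port=5" or "error = 60".
    event["fields"] = dict(re.findall(r"(\w+)\s*=\s*(\w+)", event["detail"]))
    return event


def tally(lines):
    """Count events keyed by (severity, description, port or None)."""
    counts = Counter()
    for line in lines:
        event = parse_line(line)
        if event is not None:
            key = (event["sev"], event["msg"], event["fields"].get("port"))
            counts[key] += 1
    return counts


# The rebuild log from the reply above, copied verbatim.
SAMPLE = """\
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: INFO: (0x04: 0x003b): Rebuild paused: unit=0
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
twa0: INFO: (0x04: 0x003b): Rebuild paused: unit=0
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
twa0: INFO: (0x04: 0x0005): Rebuild completed: unit=0
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5
""".splitlines()

COUNTS = tally(SAMPLE)
```

Iterating over `COUNTS.most_common()` lists the most frequent events first. Because `parse_line` uses `search` rather than `match`, lines carrying a syslog prefix such as "Aug 8 15:58:21 kernel:" should parse the same way.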