From owner-freebsd-stable@FreeBSD.ORG  Mon Aug  8 21:29:04 2005
From: Jung-uk Kim <jkim@FreeBSD.org>
To: freebsd-stable@FreeBSD.org
Date: Mon, 8 Aug 2005 17:28:38 -0400
User-Agent: KMail/1.6.2
References: <42F7C8AF.8060608@pragmeta.com>
In-Reply-To: <42F7C8AF.8060608@pragmeta.com>
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <200508081728.41755.jkim@FreeBSD.org>
Cc: Josh Endries <jendries@pragmeta.com>
Subject: Re: twa0 errors and system lockup on amd64

On Monday 08 August 2005 05:03 pm, Josh Endries wrote:
> Hello,
>
> Just a little while ago I got this on a test 5.4-stable dual
> Opteron box I'm setting up (9500S-LP RAID 5 with a hot spare):
>
> Aug  8 15:58:21 kernel: twa0: ERROR: (0x05: 0x210b): Request timed
> out!: request = 0xffffffff80a67700
> Aug  8 15:58:21 kernel: twa0: INFO: (0x16: 0x1108): Resetting
> controller...:
> Aug  8 15:58:21 kernel: twa0: ERROR: (0x15: 0x110b): Can't drain
> AEN queue after reset: error = 60
> Aug  8 15:58:21 kernel: twa0: ERROR: (0x16: 0x1105): Controller
> reset failed: error = 60; attempt 1
>
> It attempted twice and then just sat there after that. I couldn't
> log in at all so I did a hard reset after probably 30+ minutes. I
> didn't find much online other than driver source code or twe(4) man
> pages, which suggests that it's a problem between the driver and
> card. Has anyone else seen this problem? Is it a sign of a flaky
> card or could it be something else? Maybe it's something to do with
> AMD64? This system was supposed to go into production tomorrow. I
> guess it's good that it died today...

I have seen this with a 9500S-8, which is the same controller with 8 
ports.  In fact, I am seeing other problems as well:

twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: INFO: (0x04: 0x003b): Rebuild paused: unit=0
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
twa0: INFO: (0x04: 0x003b): Rebuild paused: unit=0
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
twa0: INFO: (0x04: 0x0005): Rebuild completed: unit=0
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5

I have rebuilt this array many times, but it keeps happening.  It 
seems this controller/driver has issues with amd64.  FYI, neither a UP 
kernel nor replacing the cables fixed the problem.
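
In case it helps anyone compare logs, here is a minimal sketch for 
tallying twa messages like the ones above by severity and by port.  It 
is plain Python, not part of the twa driver or any 3ware tool.  The 
regular expression is an assumption based only on the line format shown 
in this mail, and the names LINE_RE and summarize() are made up for 
illustration.

```python
import re
from collections import Counter

# Pattern assumed from the log lines quoted in this mail, e.g.
# "twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5"
LINE_RE = re.compile(
    r"(?P<dev>twa\d+): (?P<sev>[A-Z]+): "
    r"\((?P<src>0x[0-9a-fA-F]+): (?P<code>0x[0-9a-fA-F]+)\): "
    r"(?P<msg>[^:]+): ?(?P<args>.*)"
)


def summarize(lines):
    """Count (severity, message) pairs and drive timeouts per port."""
    events = Counter()
    port_timeouts = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a twa message in the assumed format
        msg = m.group("msg").strip()
        events[(m.group("sev"), msg)] += 1
        port = re.search(r"port=(\d+)", m.group("args"))
        if port and msg.startswith("Drive timeout"):
            port_timeouts[int(port.group(1))] += 1
    return events, port_timeouts


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
twa0: INFO: (0x04: 0x000b): Rebuild started: unit=0
twa0: ERROR: (0x04: 0x0002): Degraded unit: unit=0, port=5
twa0: ERROR: (0x04: 0x0009): Drive timeout detected: port=5
""".splitlines()

if __name__ == "__main__":
    events, port_timeouts = summarize(SAMPLE)
    for (sev, msg), n in sorted(events.items()):
        print(f"{n:3d}  {sev:5s}  {msg}")
    print("drive timeouts per port:", dict(port_timeouts))
```

Feeding it the relevant lines from /var/log/messages should show 
whether the timeouts always land on the same port, which may help tell 
a bad drive or slot apart from a controller/driver problem.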

Good luck,

Jung-uk Kim

> Josh