From: "James R. Van Artsdalen"
Date: Thu, 29 Oct 2009 22:49:33 -0500
To: FreeBSD Current
Cc: Alexander Motin
Subject: CAM/SIIS CAM_CMD_TIMEOUT hangs
Message-ID: <4AEA624D.10602@jrv.org>
List-Id: Discussions about the use of FreeBSD-current

I have problems with I/O hanging due to CAM/SIIS not handling the
CAM_CMD_TIMEOUT error. Hangs happen every few hundred GB to every few TB.
The disks are behind SATA port multipliers.

This command un-hangs the drive and lets things run again:

# camcontrol reset all

I assume this sends a soft reset to the disk drive, but I haven't checked yet.

Is there a way xpt_done() or something similar might notice a
CAM_CMD_TIMEOUT and inject a "soft reset" request at the head of the I/O
queue (to run before the timed-out command retries)?
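
To make the question concrete, here is a toy user-space model of the ordering I have in mind. It is not CAM code and does not compile against the kernel: every name in it (toy_queue, toy_done, TOY_SOFT_RESET and so on) is hypothetical and made up for illustration. It only shows the idea that a completion hook, on seeing a timeout, puts the failed command back at the head of the queue and then places a reset request in front of it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these identifiers exist in CAM or siis(4). */
enum toy_status { TOY_OK, TOY_TIMEOUT };
enum toy_kind { TOY_IO, TOY_SOFT_RESET };

struct toy_cmd {
    enum toy_kind kind;
    int id;                     /* request number, -1 for a reset */
};

#define TOY_QUEUE_MAX 16

struct toy_queue {
    struct toy_cmd slot[TOY_QUEUE_MAX];
    size_t len;                 /* slot[0] is the head of the queue */
};

/* Append a command at the tail (normal submission path). */
void toy_push_tail(struct toy_queue *q, struct toy_cmd c)
{
    assert(q->len < TOY_QUEUE_MAX);
    q->slot[q->len++] = c;
}

/* Insert a command at the head so it runs before everything queued. */
void toy_push_head(struct toy_queue *q, struct toy_cmd c)
{
    size_t i;

    assert(q->len < TOY_QUEUE_MAX);
    for (i = q->len; i > 0; i--)
        q->slot[i] = q->slot[i - 1];
    q->slot[0] = c;
    q->len++;
}

/* Remove and return the command at the head. */
struct toy_cmd toy_pop_head(struct toy_queue *q)
{
    struct toy_cmd c;
    size_t i;

    assert(q->len > 0);
    c = q->slot[0];
    for (i = 1; i < q->len; i++)
        q->slot[i - 1] = q->slot[i];
    q->len--;
    return c;
}

/*
 * Completion hook. On a timed-out I/O command, requeue that command at
 * the head, then put a soft reset ahead of it, so the reset runs first
 * and the retry runs second. Any other completion is left alone.
 */
void toy_done(struct toy_queue *q, struct toy_cmd c, enum toy_status st)
{
    struct toy_cmd reset = { TOY_SOFT_RESET, -1 };

    if (st != TOY_TIMEOUT || c.kind != TOY_IO)
        return;
    toy_push_head(q, c);
    toy_push_head(q, reset);
}
```

With requests 1 and 2 queued, popping request 1 and completing it through toy_done() with TOY_TIMEOUT leaves the queue as: soft reset, request 1, request 2. Whether the real completion path can safely do the equivalent is exactly what I am asking.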