Date: Sat, 30 Jul 2011 19:37:23 +0300
From: Kostik Belousov
To: Alexander Motin
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject: Re: svn commit: r224496 - head/sys/cam
Message-ID: <20110730163723.GZ17489@deviant.kiev.zoral.com.ua>
In-Reply-To: <201107292030.p6TKUSaf064895@svn.freebsd.org>

On Fri, Jul 29, 2011 at 08:30:28PM +0000, Alexander Motin wrote:
> Author: mav
> Date: Fri Jul 29 20:30:28 2011
> New Revision: 224496
> URL: http://svn.freebsd.org/changeset/base/224496
>
> Log:
>   In some cases failed SATA disks may report their presence, but don't
>   respond to any commands. I've found that because of multiple command
>   retries, each of which cause 30s timeout, bus reset and another retry or
>   requeue for many commands, it may take ages to eventually drop the
>   failed device. The odd thing is that those retries continue even after
>   XPT considered device as dead and invalidated it.
>
>   This patch makes cam_periph_error() to block any command retries after
>   periph was marked as invalid. With that patch all activity completes in
>   1-2 minutes, just after several timeouts, required to consider device
>   death. This should make ZFS, gmirror, graid, etc. operation more robust.
>
>   Reviewed by: mjacob@ on scsi@
>
>   Approved by: re (kib)
>
> Modified:
>   head/sys/cam/cam_periph.c

Amusingly, this commit makes my test machine fail to boot. This is an
Ibex Peak PCH, with two SATA disks on channels 0 and 1.

It seems that geom thread 100012 owns the GEOM topology lock while
sleeping in adaclose() -> cam_periph_getccb():

db> bt 100012
Tracing pid 12 tid 100012 td 0xfffffe00028a2000
sched_switch() at 0xffffffff8034a0c7 = sched_switch+0x157
mi_switch() at 0xffffffff803291fb = mi_switch+0x2eb
sleepq_switch() at 0xffffffff803631f3 = sleepq_switch+0x123
sleepq_wait() at 0xffffffff80363eed = sleepq_wait+0x4d
_sleep() at 0xffffffff80329b59 = _sleep+0x3b9
cam_periph_getccb() at 0xffffffff817ffc50 = cam_periph_getccb+0xa0
adaclose() at 0xffffffff8182c484 = adaclose+0xc4
g_disk_access() at 0xffffffff802bea74 = g_disk_access+0x1e4
g_access() at 0xffffffff802c519a = g_access+0x1ba
g_dev_attrchanged() at 0xffffffff802bd1f6 = g_dev_attrchanged+0x96
g_dev_taste() at 0xffffffff802bd574 = g_dev_taste+0x284
g_new_provider_event() at 0xffffffff802c4ecd = g_new_provider_event+0xad
g_run_events() at 0xffffffff802c0750 = g_run_events+0x250
fork_exit() at 0xffffffff802f0d99 = fork_exit+0x189
fork_trampoline() at 0xffffffff804ee3be = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff800025fd00, rbp = 0 ---

(gdb) list *cam_periph_getccb+0xa0
0x1c50 is in cam_periph_getccb (/usr/home/kostik/work/build/bsd/DEV/src/sys/modules/cam/../../cam/cam_periph.c:883).
882
883             while (SLIST_FIRST(&periph->ccb_list) == NULL) {
884                     if (periph->immediate_priority > priority)

Reverting the revision or not loading ahci.ko allows the machine to boot.