From: Stephane LAPIE
Date: Sun, 07 Feb 2010 11:44:49 +0900
To: Andriy Gapon
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org, Julian Elischer, freebsd-hardware@freebsd.org
Subject: Re: [zfs][hardware] Reproducible kernel panic in 8.0-STABLE
Message-ID: <4B6E2921.60403@darkbsd.org>
In-Reply-To: <4B6B4E2E.2010902@icyb.net.ua>
References: <4B682972.6030604@darkbsd.org> <4B682F29.90505@icyb.net.ua> <4B686324.2090308@elischer.org> <4B68641D.9000201@icyb.net.ua> <4B695CA3.50008@darkbsd.org> <4B6B4E2E.2010902@icyb.net.ua>
List-Id: Production branch of FreeBSD source code

Andriy Gapon wrote:
> on 03/02/2010 13:23 Stephane LAPIE said the following:
>> I just rebuilt a kernel with debugger options, and obtained the
>> following output upon pulling out one disk :
>>
>> Sleeping thread (tid 100024, pid 0) owns a non-sleepable lock
>> sched_switch() at sched_switch+0xf8
>> mi_switch() at mi_switch+0x16f
>> sleepq_timedwait() at sleepq_timedwait+0x42
>> _cv_timedwait() at _cv_timedwait+0x129
>> _sema_timedwait() at _sema_timedwait+0x55
>> ata_queue_request() at ata_queue_request+0x526
>> ata_controlcmd() at ata_controlcmd+0xa1
>> ata_setmode() at ata_setmode+0xdc
>> ad_init() at ad_init+0x27
>> ad_reinit() at ad_reinit+0x48
>> ata_reinit() at ata_reinit+0x268
>> ata_conn_event() at ata_conn_event+0x49
>> taskqueue_run() at taskqueue_run+0x93
>> taskqueue_thread_loop() at taskqueue_thread_loop+0x46
>> fork_exit() at fork_exit+0x118
>> fork_trampoline() at fork_trampoline+0xe
>> --- trap 0, rip = 0, rsp = 0xffffff80000aad30, rbp = 0 ---
>> panic: sleeping thread
>> cpuid = 2
>> KDB: enter: panic
>> [thread pid 12 tid 100008 ]
>> Stopped at kdb_enter+0x3d: movq $0,0x4943d0(%rip)
>
> Not sure if I can derive anything useful from here.
> Someone with more expertise is needed.
> One thing I noticed is that ata_conn_event and ata_reinit and some other
> functions up the stack acquire state_mtx recursively, but the mutex is not
> initialized with MTX_RECURSE.
>
> Perhaps, indeed you would have a better luck with AHCI controller _and_ ahci(4)
> driver. It seems to handle dynamic coming and going of disks much better than
> ata(4).

I just moved half of the "flaky" drives onto an AHCI controller.
This works much better: the disk detection issues are solved, and
hotplug works exactly like it does with SCSI. It does require running
camcontrol to get the disks recognized again, but that is not really
a problem.

Although I'm probably dreaming: has anyone heard of AHCI controllers
besides the on-board ones?

Thanks to everyone for their advice.

-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo
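
For anyone reading this thread later: the point Andriy raises above
(state_mtx being acquired recursively although it was not initialized
with MTX_RECURSE) can be illustrated with a small userland analogy.
This is only a sketch in Python, not FreeBSD kernel code. The helper
name can_relock is hypothetical, and threading.Lock / threading.RLock
merely stand in for a mutex created without and with recursion
allowed; the kernel's actual behaviour on a recursive acquire is not
modelled here.

```python
import threading


def can_relock(lock) -> bool:
    """Return True if the thread already holding `lock` can take it again.

    Hypothetical helper for illustration only.
    """
    lock.acquire()
    try:
        # Non-blocking attempt: a non-reentrant lock reports failure
        # here instead of deadlocking the calling thread.
        got_it = lock.acquire(blocking=False)
        if got_it:
            # Undo the nested acquire so the lock ends up fully released.
            lock.release()
        return got_it
    finally:
        lock.release()


if __name__ == "__main__":
    # Plain lock: stands in for a mutex that does not allow recursion.
    print("plain lock, second acquire succeeds:",
          can_relock(threading.Lock()))
    # Reentrant lock: stands in for a mutex that allows recursion.
    print("reentrant lock, second acquire succeeds:",
          can_relock(threading.RLock()))
```

The only thing this is meant to show is that whether a nested acquire
by the same thread is legal depends on how the lock was created, which
is why the missing MTX_RECURSE flag stood out in the trace.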