Message-ID: <4B6B4E2E.2010902@icyb.net.ua>
Date: Fri, 05 Feb 2010 00:46:06 +0200
From: Andriy Gapon
To: Stephane LAPIE
Cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org, Julian Elischer, freebsd-hardware@freebsd.org
Subject: Re: [zfs][hardware] Reproducible kernel panic in 8.0-STABLE

on 03/02/2010 13:23 Stephane LAPIE said the following:
>
> I just rebuilt a kernel with debugger options, and obtained the
> following output upon pulling out one disk :
>
> Sleeping thread (tid 100024, pid 0) owns a non-sleepable lock
> sched_switch() at sched_switch+0xf8
> mi_switch() at mi_switch+0x16f
> sleepq_timedwait() at sleepq_timedwait+0x42
> _cv_timedwait() at _cv_timedwait+0x129
> _sema_timedwait() at _sema_timedwait+0x55
> ata_queue_request() at ata_queue_request+0x526
> ata_controlcmd() at ata_controlcmd+0xa1
> ata_setmode() at ata_setmode+0xdc
> ad_init() at ad_init+0x27
> ad_reinit() at ad_reinit+0x48
> ata_reinit() at ata_reinit+0x268
> ata_conn_event() at ata_conn_event+0x49
> taskqueue_run() at taskqueue_run+0x93
> taskqueue_thread_loop() at taskqueue_thread_loop+0x46
> fork_exit() at fork_exit+0x118
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffff80000aad30, rbp = 0 ---
> panic: sleeping thread
> cpuid = 2
> KDB: enter: panic
> [thread pid 12 tid 100008 ]
> Stopped at kdb_enter+0x3d: movq $0,0x4943d0(%rip)

I am not sure I can derive anything useful from this.
Someone with more expertise is needed.
One thing I noticed is that ata_conn_event, ata_reinit and some other functions
up the stack acquire state_mtx recursively, but the mutex is not initialized
with MTX_RECURSE.

Perhaps you would indeed have better luck with an AHCI controller _and_ the
ahci(4) driver. It seems to handle the dynamic coming and going of disks much
better than ata(4).

-- 
Andriy Gapon
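
P.S. A userspace illustration of the MTX_RECURSE remark above. This is only an analogy using POSIX threads, not the ata(4) code and not the kernel mutex(9) API; the helper name relock_result is made up for this sketch. It shows why the way a mutex is initialized matters when the same thread takes it twice: an error-checking mutex refuses the second acquisition, while one explicitly created as recursive accepts it.

```c
#define _XOPEN_SOURCE 700   /* expose pthread_mutexattr_settype() and mutex types */

#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper: create a mutex of the given type, lock it twice
 * from the calling thread, and return the result of the second lock call.
 */
int relock_result(int type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, type);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);          /* first acquisition always succeeds */
    rc = pthread_mutex_lock(&m);     /* second acquisition by the owner */
    if (rc == 0)
        pthread_mutex_unlock(&m);    /* undo the nested acquisition */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);

    return rc;
}
```

Calling relock_result(PTHREAD_MUTEX_ERRORCHECK) yields EDEADLK, while relock_result(PTHREAD_MUTEX_RECURSIVE) yields 0. Whether the recursive acquisition of state_mtx has anything to do with the "sleeping thread" panic in the trace is a separate question that I cannot answer from the backtrace alone.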