From owner-freebsd-stable@FreeBSD.ORG Fri Mar 21 07:11:11 2014
From: Daniel Braniss
To: Alexander Motin
Cc: Rick Macklem, freebsd-stable@freebsd.org, John Baldwin
Subject: Re: nfsd panic
Date: Fri, 21 Mar 2014 09:11:04 +0200
Message-Id: <9E40826A-1850-467E-A7F7-A662063CE36E@cs.huji.ac.il>
In-Reply-To: <532B3FF7.1030405@FreeBSD.org>
References: <50E659BE-3E1A-41D6-B522-9452093CEE26@cs.huji.ac.il> <201403201108.09700.jhb@freebsd.org> <532B3FF7.1030405@FreeBSD.org>
List-Id: Production branch of FreeBSD source code

On Mar 20, 2014, at 9:22 PM, Alexander Motin wrote:

> On 20.03.2014 17:08, John Baldwin wrote:
>> On Thursday, March 20, 2014 8:35:37 am Daniel Braniss wrote:
>>> this host has been doing fine, but today it's constantly crashing in nfsd
>>> it's exporting a 32TB zfs via nfs, to several hungry hosts
>>> any help is appreciated since this is a production server and some users are
>>> not very happy :-)
>>>
>>> http://www.cs.huji.ac.il/~danny/core.txt.7
>>
>> I think the pool->sp_lock mutex is not locked. Can you go to frame 8 in kgdb
>> and do 'p *m'?

(kgdb) frame 8
#8  0xffffffff808cf25a in _mtx_unlock_sleep (m=0xfffffe002c132400,
    opts=, file=,
    line=) at /r+d/stable/9/sys/kern/kern_mutex.c:716
716     /r+d/stable/9/sys/kern/kern_mutex.c: No such file or directory.
        in /r+d/stable/9/sys/kern/kern_mutex.c
(kgdb) p *m
$1 = {lock_object = {lo_name = 0xffffffff80f5d757 "sp_lock",
    lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}

>
> Daniel, your system looks like it was updated on February 2, but a similar
> bug was fixed in stable/9 on February 7 (r261578). Please try to update
> your system.
>

I did and now it's ok again, though I will have to wait till traffic picks up
again next week to be sure

thanks,
        danny

> --
> Alexander Motin

thanks to you all,
        danny
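
A note on reading the `p *m` output above: the `mtx_lock` word holds either the owning thread pointer or a few flag bits in its low bits. The sketch below decodes such a value. It assumes the flag values from FreeBSD's `sys/sys/mutex.h` (`MTX_RECURSED` 0x1, `MTX_CONTESTED` 0x2, `MTX_UNOWNED` 0x4), which should be checked against the source tree in use; `decode_mtx_lock` is a hypothetical helper written for illustration, not part of kgdb.

```python
# Assumed flag bits kept in the low bits of mtx_lock
# (as in FreeBSD's sys/sys/mutex.h; verify against your own tree).
MTX_RECURSED = 0x1
MTX_CONTESTED = 0x2
MTX_UNOWNED = 0x4
MTX_FLAGMASK = MTX_RECURSED | MTX_CONTESTED | MTX_UNOWNED


def decode_mtx_lock(value):
    """Split a raw mtx_lock word into owner pointer bits and flag bits.

    Hypothetical helper: 'owner' is whatever remains after masking off
    the flag bits, and is 0 when no thread pointer is stored.
    """
    return {
        "owner": value & ~MTX_FLAGMASK,
        "recursed": bool(value & MTX_RECURSED),
        "contested": bool(value & MTX_CONTESTED),
        "unowned": bool(value & MTX_UNOWNED),
    }


if __name__ == "__main__":
    # Value taken from the kgdb session in this thread: mtx_lock = 4.
    print(decode_mtx_lock(4))
```

Under these assumptions, `mtx_lock = 4` decodes to no owner pointer with only the unowned bit set, which would be consistent with John Baldwin's guess that `pool->sp_lock` was not locked when `_mtx_unlock_sleep` ran.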