From owner-freebsd-current@FreeBSD.ORG Mon Mar 3 13:34:36 2008 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B1E9D1065672; Mon, 3 Mar 2008 13:34:36 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from relay03.kiev.sovam.com (relay03.kiev.sovam.com [62.64.120.201]) by mx1.freebsd.org (Postfix) with ESMTP id 1975F8FC1B; Mon, 3 Mar 2008 13:34:35 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from [212.82.216.226] (helo=skuns.kiev.zoral.com.ua) by relay03.kiev.sovam.com with esmtps (TLSv1:AES256-SHA:256) (Exim 4.67) (envelope-from ) id 1JWAoJ-0006OQ-RV; Mon, 03 Mar 2008 15:34:34 +0200 Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by skuns.kiev.zoral.com.ua (8.14.2/8.14.2) with ESMTP id m23DYNA7069969 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 3 Mar 2008 15:34:23 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.2/8.14.2) with ESMTP id m23DY7U3029530; Mon, 3 Mar 2008 15:34:07 +0200 (EET) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.2/8.14.2/Submit) id m23DY7jd029529; Mon, 3 Mar 2008 15:34:07 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 3 Mar 2008 15:34:06 +0200 From: Kostik Belousov To: pluknet Message-ID: <20080303133406.GV57756@deviant.kiev.zoral.com.ua> References: Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="/q9mwuE15ZemPHDK" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: ClamAV version 0.91.2, clamav-milter version 0.91.2 on skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.2.4 X-Spam-Checker-Version: SpamAssassin 3.2.4 (2008-01-01) on skuns.kiev.zoral.com.ua X-Scanner-Signature: 547e95e07853df1f7f8c99bbf2c45db3 X-DrWeb-checked: yes X-SpamTest-Envelope-From: kostikbel@gmail.com X-SpamTest-Group-ID: 00000000 X-SpamTest-Header: Not Detected X-SpamTest-Info: Profiles 2348 [Mar 3 2008] X-SpamTest-Info: helo_type=3 X-SpamTest-Info: {TO: seems autogenerated} X-SpamTest-Info: {TO: local part of email appears in body} X-SpamTest-Method: none X-SpamTest-Rate: 19 X-SpamTest-Status: Not detected X-SpamTest-Status-Extended: not_detected X-SpamTest-Version: SMTP-Filter Version 3.0.0 [0278], KAS30/Release Cc: FreeBSD Current Subject: Re: panic: mutex Giant owned at nfs_syscalls.c:556 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Mar 2008 13:34:36 -0000 --/q9mwuE15ZemPHDK Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Mar 03, 2008 at 05:41:33AM +0300, pluknet wrote: > On 20/02/2008, pluknet wrote: > [snip] > > Unread portion of the kernel message buffer: > > panic: mutex Giant owned at > > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:556 > > KDB: enter: panic > > exclusive sleep mutex nfsd_mtx r =3D 0 (0xc41d1660) locked @ > > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:501 > > exclusive sleep mutex Giant r =3D 0 (0xc07e6410) locked @ > > /usr/src/sys/kern/vfs_lookup.c:663 > > ... > > #9 0xc053959d in panic (fmt=3D0xc076181d "mutex %s owned at %s:%d") > > at /usr/src/sys/kern/kern_shutdown.c:555 > > #10 0xc052adf7 in _mtx_assert (m=3D0xc07e6410, what=3D0, > > file=3D0xc41cb7b2 > > "/usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c", > > line=3D556) at /usr/src/sys/kern/kern_mutex.c:652 > > #11 0xc41c9e82 in nfssvc (td=3D0xc2e68000, uap=3D0xd600dcfc) > > at /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:556 > > #12 0xc0727903 in syscall (frame=3D0xd600dd38) > > at /usr/src/sys/i386/i386/trap.c:1034 > > #13 0xc0711630 in Xint0x80_syscall () at /usr/src/sys/i386/i386/except= ion.s:203 > > ---Type to continue, or q to quit--- > > #14 0x00000033 in ?? () >=20 > [Some new info probably not related to the thread.] > Today I got two new panics while performing /etc/rc.d/nfsd stop: >=20 > System call nfssvc returning with the following locks held: > exclusive sleep mutex Giant r =3D 2 (0xc07e6410) locked @ > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c:1106 > panic: witness_warn > ... > #9 0xc053959d in panic (fmt=3D0xc076791b "witness_warn") > at /usr/src/sys/kern/kern_shutdown.c:555 > #10 0xc056ff99 in witness_warn (flags=3D2, lock=3D0x0, > fmt=3D0xc0789975 "System call %s returning") > at /usr/src/sys/kern/subr_witness.c:1396 > #11 0xc0727a56 in syscall (frame=3D0xd5f9ed38) > at /usr/src/sys/i386/i386/trap.c:1083 > #12 0xc0711630 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception= .s:203 > #13 0x00000033 in ?? () >=20 > The second is different only in place where Giant is acquired. > System call nfssvc returning with the following locks held: > exclusive sleep mutex Giant r =3D 0 (0xc07e6410) locked @ > /usr/src/sys/kern/vfs_lookup.c:663 > panic: witness_warn >=20 >=20 > On 21/02/2008, Kris Kennaway wrote: > > Hopefully it should be a straightforward fix to track down which code > > path is missing the VFS_UNLOCK_GIANT() to reach the above stack trace, > > or to reproduce. >=20 > I've marked VFS_[UN]LOCK_GIANT() in nfs_(serv|srvsubs).c with > appropriate printf()'s: > $ rm 123vfslocked lock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 > vfslocked unlock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 253 >=20 > vfslocked lock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 > vfslocked unlock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 253 > vfslocked lock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 > vfslocked lock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 673 > vfslocked unlock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 916 > vfslocked unlock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 648 > vfslocked lock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 > vfslocked lock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 673 > vfslocked unlock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 916 > vfslocked unlock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 648 > vfslocked lock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 2161 > vfslocked lock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 > vfslocked lock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 673 > vfslocked unlock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 916 > vfslocked unlock in > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 2234 > panic: mutex Giant owned at > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:556 >=20 > So, Giant is acquired in nfs_srvsubs.c:1106 and is not released in > nfs_serv.c:648 > Btw, I cannot reproduce it with module compiled via > cd /sys/modules/nfsserver; make all && make install >=20 > On 23/02/2008, Kostik Belousov wrote: > > Assert itself only catches the missed Giant unlock somewhere during > > the execution of an nfs requtest. > > Unfortunately, it seems that the error is in nfsserver that missed > > Giant unlock (most likely, on some error path). Helpful would be > > the tcpdump raw trace file of the communication between nfs server and > > client immediately before the panic. >=20 > Here it is (with -xvvn). [I can reproduce the panic only when I try to > delete a file on nfs.] I mean the raw dump that can be loaded in wireshark, for instance. I prefer to not do the manual RPC decoding. >=20 > 03:07:59.342714 IP (tos 0x0, ttl 128, id 39253, offset 0, flags > [none], proto UDP (17), length 136) 192.168.200.185.1968921799 > > 192.168.200.137.2049: 108 access [|nfs] > 0x0000: 4500 0088 9955 0000 8011 8e7b c0a8 c8b9 > 0x0010: c0a8 c889 03f8 0801 0074 3c9e 755b 5cc7 > 0x0020: 0000 0000 0000 0002 0001 86a3 0000 0003 > 0x0030: 0000 0004 0000 0001 0000 0020 0000 0000 > 0x0040: 0000 0000 0000 03e9 0000 03e9 0000 0003 > 0x0050: 0000 > 03:07:59.343818 IP (tos 0x0, ttl 128, id 30823, offset 0, flags > [none], proto UDP (17), length 148) 192.168.200.137.2049 > > 192.168.200.185.1968921799: reply ok 120 access attr: REG 755 ids > 1001/0 [|nfs] > 0x0000: 4500 0094 7867 0000 8011 af5d c0a8 c889 > 0x0010: c0a8 c8b9 0801 03f8 0080 1bda 755b 5cc7 > 0x0020: 0000 0001 0000 0000 0000 0000 0000 0000 > 0x0030: 0000 0000 0000 0000 0000 0001 0000 0001 > 0x0040: 0000 01ed 0000 0001 0000 03e9 0000 0000 > 0x0050: 0000 > 03:07:59.344297 IP (tos 0x0, ttl 128, id 20293, offset 0, flags > [none], proto UDP (17), length 140) 192.168.200.185.1968921800 > > 192.168.200.137.2049: 112 lookup [|nfs] > 0x0000: 4500 008c 4f45 0000 8011 d887 c0a8 c8b9 > 0x0010: c0a8 c889 03f8 0801 0078 58a1 755b 5cc8 > 0x0020: 0000 0000 0000 0002 0001 86a3 0000 0003 > 0x0030: 0000 0003 0000 0001 0000 0020 0000 0000 > 0x0040: 0000 0000 0000 03e9 0000 03e9 0000 0003 > 0x0050: 0000 > 03:07:59.344782 IP (tos 0x0, ttl 128, id 64063, offset 0, flags > [none], proto UDP (17), length 264) 192.168.200.137.2049 > > 192.168.200.185.1968921800: reply ok 236 lookup [|nfs] > 0x0000: 4500 0108 fa3f 0000 8011 2d11 c0a8 c889 > 0x0010: c0a8 c8b9 0801 03f8 00f4 feb1 755b 5cc8 > 0x0020: 0000 0001 0000 0000 0000 0000 0000 0000 > 0x0030: 0000 0000 0000 0000 0000 001c 5100 0000 > 0x0040: 0400 0000 0c00 0000 4dc9 1000 8001 0000 > 0x0050: 0000 > 03:07:59.345530 IP (tos 0x0, ttl 128, id 32619, offset 0, flags > [none], proto UDP (17), length 140) 192.168.200.185.1968921801 > > 192.168.200.137.2049: 112 lookup [|nfs] > 0x0000: 4500 008c 7f6b 0000 8011 a861 c0a8 c8b9 > 0x0010: c0a8 c889 03f8 0801 0078 58a0 755b 5cc9 > 0x0020: 0000 0000 0000 0002 0001 86a3 0000 0003 > 0x0030: 0000 0003 0000 0001 0000 0020 0000 0000 > 0x0040: 0000 0000 0000 03e9 0000 03e9 0000 0003 > 0x0050: 0000 > 03:07:59.346752 IP (tos 0x0, ttl 128, id 50548, offset 0, flags > [none], proto UDP (17), length 264) 192.168.200.137.2049 > > 192.168.200.185.1968921801: reply ok 236 lookup [|nfs] > 0x0000: 4500 0108 c574 0000 8011 61dc c0a8 c889 > 0x0010: c0a8 c8b9 0801 03f8 00f4 feb0 755b 5cc9 > 0x0020: 0000 0001 0000 0000 0000 0000 0000 0000 > 0x0030: 0000 0000 0000 0000 0000 001c 5100 0000 > 0x0040: 0400 0000 0c00 0000 4dc9 1000 8001 0000 > 0x0050: 0000 > 03:07:59.347265 IP (tos 0x0, ttl 128, id 52799, offset 0, flags > [none], proto UDP (17), length 140) 192.168.200.185.1968921802 > > 192.168.200.137.2049: 112 remove [|nfs] > 0x0000: 4500 008c ce3f 0000 8011 598d c0a8 c8b9 > 0x0010: c0a8 c889 03f8 0801 0078 5896 755b 5cca > 0x0020: 0000 0000 0000 0002 0001 86a3 0000 0003 > 0x0030: 0000 000c 0000 0001 0000 0020 0000 0000 > 0x0040: 0000 0000 0000 03e9 0000 03e9 0000 0003 > 0x0050: 0000 > 03:07:59.348654 IP (tos 0x0, ttl 128, id 55143, offset 0, flags > [none], proto UDP (17), length 172) 192.168.200.137.2049 > > 192.168.200.185.1968921802: reply ok 144 remove PRE: [|nfs] > 0x0000: 4500 00ac d767 0000 8011 5045 c0a8 c889 > 0x0010: c0a8 c8b9 0801 03f8 0098 1757 755b 5cca > 0x0020: 0000 0001 0000 0000 0000 0000 0000 0000 > 0x0030: 0000 0000 0000 0000 0000 0001 0000 0000 > 0x0040: 0000 4000 4099 2bf4 0000 0000 4099 2bf4 > 0x0050: 0000 >=20 > wbr, > pluknet --/q9mwuE15ZemPHDK Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.8 (FreeBSD) iEYEARECAAYFAkfL/k4ACgkQC3+MBN1Mb4jl8wCdGLw7COPjcQt+ljbkxyxcl6Ww TrYAmgMm/yPVmT8FYVWBMlzJu8jwVqq3 =QIvB -----END PGP SIGNATURE----- --/q9mwuE15ZemPHDK--