From owner-freebsd-current@FreeBSD.ORG Mon Mar 3 02:41:36 2008 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4CE0F1065671 for ; Mon, 3 Mar 2008 02:41:36 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.170]) by mx1.freebsd.org (Postfix) with ESMTP id 883208FC1C for ; Mon, 3 Mar 2008 02:41:35 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: by ug-out-1314.google.com with SMTP id y2so1827511uge.37 for ; Sun, 02 Mar 2008 18:41:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=AJjTKOtpkiCrFHq9YC/ASUmicb93Q0ZxBaTlbpkrzks=; b=vYCm0JFtzqm/YtuU2Q+UfIRp2IQwfR0Es080ihYo5Ha/HnGGMMjo8ZfmaJEC6Nx6DArwsy90xfY3vHM7czSCALAkbpzDsDJfbm4WEJuH1flwOwuAH7ZTT7bKbiREqRGhIaTpyI2mY/2XXOHNUkg09afTZ5xcN1I73GaVgZBBkVg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=AI5MTjwBRBvLopoQItdM6wpI/65Acm4t+yOy+b0/9sWFoyFXmde5UDDCDWrPjS69VBvp3nDEdg5YbyAm41hna8N8f75fn91QX5PL1n2CcTXojsWZH11N1a+67Kuq+iBUiu6iI6AWSKwunTCyo6o1n5VirAW2IXdoDK6+NeRNjU8= Received: by 10.78.81.20 with SMTP id e20mr17119443hub.19.1204512093548; Sun, 02 Mar 2008 18:41:33 -0800 (PST) Received: by 10.78.46.11 with HTTP; Sun, 2 Mar 2008 18:41:33 -0800 (PST) Message-ID: Date: Mon, 3 Mar 2008 05:41:33 +0300 From: pluknet To: "FreeBSD Current" , "Kris Kennaway" , "Kostik Belousov" In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: Cc: Subject: Re: panic: mutex Giant owned at nfs_syscalls.c:556 X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 03 Mar 2008 02:41:36 -0000 On 20/02/2008, pluknet wrote: [snip] > Unread portion of the kernel message buffer: > panic: mutex Giant owned at > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:556 > KDB: enter: panic > exclusive sleep mutex nfsd_mtx r = 0 (0xc41d1660) locked @ > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:501 > exclusive sleep mutex Giant r = 0 (0xc07e6410) locked @ > /usr/src/sys/kern/vfs_lookup.c:663 > ... > #9 0xc053959d in panic (fmt=0xc076181d "mutex %s owned at %s:%d") > at /usr/src/sys/kern/kern_shutdown.c:555 > #10 0xc052adf7 in _mtx_assert (m=0xc07e6410, what=0, > file=0xc41cb7b2 > "/usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c", > line=556) at /usr/src/sys/kern/kern_mutex.c:652 > #11 0xc41c9e82 in nfssvc (td=0xc2e68000, uap=0xd600dcfc) > at /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:556 > #12 0xc0727903 in syscall (frame=0xd600dd38) > at /usr/src/sys/i386/i386/trap.c:1034 > #13 0xc0711630 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:203 > ---Type to continue, or q to quit--- > #14 0x00000033 in ?? () [Some new info probably not related to the thread.] Today I got two new panics while performing /etc/rc.d/nfsd stop: System call nfssvc returning with the following locks held: exclusive sleep mutex Giant r = 2 (0xc07e6410) locked @ /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c:1106 panic: witness_warn ... #9 0xc053959d in panic (fmt=0xc076791b "witness_warn") at /usr/src/sys/kern/kern_shutdown.c:555 #10 0xc056ff99 in witness_warn (flags=2, lock=0x0, fmt=0xc0789975 "System call %s returning") at /usr/src/sys/kern/subr_witness.c:1396 #11 0xc0727a56 in syscall (frame=0xd5f9ed38) at /usr/src/sys/i386/i386/trap.c:1083 #12 0xc0711630 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:203 #13 0x00000033 in ?? () The second is different only in place where Giant is acquired. System call nfssvc returning with the following locks held: exclusive sleep mutex Giant r = 0 (0xc07e6410) locked @ /usr/src/sys/kern/vfs_lookup.c:663 panic: witness_warn On 21/02/2008, Kris Kennaway wrote: > Hopefully it should be a straightforward fix to track down which code > path is missing the VFS_UNLOCK_GIANT() to reach the above stack trace, > or to reproduce. I've marked VFS_[UN]LOCK_GIANT() in nfs_(serv|srvsubs).c with appropriate printf()'s: $ rm 123vfslocked lock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 vfslocked unlock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 253 vfslocked lock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 vfslocked unlock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 253 vfslocked lock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 vfslocked lock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 673 vfslocked unlock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 916 vfslocked unlock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 648 vfslocked lock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 vfslocked lock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 673 vfslocked unlock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 916 vfslocked unlock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 648 vfslocked lock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 2161 vfslocked lock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106 vfslocked lock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 673 vfslocked unlock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 916 vfslocked unlock in /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 2234 panic: mutex Giant owned at /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:556 So, Giant is acquired in nfs_srvsubs.c:1106 and is not released in nfs_serv.c:648 Btw, I cannot reproduce it with module compiled via cd /sys/modules/nfsserver; make all && make install On 23/02/2008, Kostik Belousov wrote: > Assert itself only catches the missed Giant unlock somewhere during > the execution of an nfs requtest. > Unfortunately, it seems that the error is in nfsserver that missed > Giant unlock (most likely, on some error path). Helpful would be > the tcpdump raw trace file of the communication between nfs server and > client immediately before the panic. Here it is (with -xvvn). [I can reproduce the panic only when I try to delete a file on nfs.] 03:07:59.342714 IP (tos 0x0, ttl 128, id 39253, offset 0, flags [none], proto UDP (17), length 136) 192.168.200.185.1968921799 > 192.168.200.137.2049: 108 access [|nfs] 0x0000: 4500 0088 9955 0000 8011 8e7b c0a8 c8b9 0x0010: c0a8 c889 03f8 0801 0074 3c9e 755b 5cc7 0x0020: 0000 0000 0000 0002 0001 86a3 0000 0003 0x0030: 0000 0004 0000 0001 0000 0020 0000 0000 0x0040: 0000 0000 0000 03e9 0000 03e9 0000 0003 0x0050: 0000 03:07:59.343818 IP (tos 0x0, ttl 128, id 30823, offset 0, flags [none], proto UDP (17), length 148) 192.168.200.137.2049 > 192.168.200.185.1968921799: reply ok 120 access attr: REG 755 ids 1001/0 [|nfs] 0x0000: 4500 0094 7867 0000 8011 af5d c0a8 c889 0x0010: c0a8 c8b9 0801 03f8 0080 1bda 755b 5cc7 0x0020: 0000 0001 0000 0000 0000 0000 0000 0000 0x0030: 0000 0000 0000 0000 0000 0001 0000 0001 0x0040: 0000 01ed 0000 0001 0000 03e9 0000 0000 0x0050: 0000 03:07:59.344297 IP (tos 0x0, ttl 128, id 20293, offset 0, flags [none], proto UDP (17), length 140) 192.168.200.185.1968921800 > 192.168.200.137.2049: 112 lookup [|nfs] 0x0000: 4500 008c 4f45 0000 8011 d887 c0a8 c8b9 0x0010: c0a8 c889 03f8 0801 0078 58a1 755b 5cc8 0x0020: 0000 0000 0000 0002 0001 86a3 0000 0003 0x0030: 0000 0003 0000 0001 0000 0020 0000 0000 0x0040: 0000 0000 0000 03e9 0000 03e9 0000 0003 0x0050: 0000 03:07:59.344782 IP (tos 0x0, ttl 128, id 64063, offset 0, flags [none], proto UDP (17), length 264) 192.168.200.137.2049 > 192.168.200.185.1968921800: reply ok 236 lookup [|nfs] 0x0000: 4500 0108 fa3f 0000 8011 2d11 c0a8 c889 0x0010: c0a8 c8b9 0801 03f8 00f4 feb1 755b 5cc8 0x0020: 0000 0001 0000 0000 0000 0000 0000 0000 0x0030: 0000 0000 0000 0000 0000 001c 5100 0000 0x0040: 0400 0000 0c00 0000 4dc9 1000 8001 0000 0x0050: 0000 03:07:59.345530 IP (tos 0x0, ttl 128, id 32619, offset 0, flags [none], proto UDP (17), length 140) 192.168.200.185.1968921801 > 192.168.200.137.2049: 112 lookup [|nfs] 0x0000: 4500 008c 7f6b 0000 8011 a861 c0a8 c8b9 0x0010: c0a8 c889 03f8 0801 0078 58a0 755b 5cc9 0x0020: 0000 0000 0000 0002 0001 86a3 0000 0003 0x0030: 0000 0003 0000 0001 0000 0020 0000 0000 0x0040: 0000 0000 0000 03e9 0000 03e9 0000 0003 0x0050: 0000 03:07:59.346752 IP (tos 0x0, ttl 128, id 50548, offset 0, flags [none], proto UDP (17), length 264) 192.168.200.137.2049 > 192.168.200.185.1968921801: reply ok 236 lookup [|nfs] 0x0000: 4500 0108 c574 0000 8011 61dc c0a8 c889 0x0010: c0a8 c8b9 0801 03f8 00f4 feb0 755b 5cc9 0x0020: 0000 0001 0000 0000 0000 0000 0000 0000 0x0030: 0000 0000 0000 0000 0000 001c 5100 0000 0x0040: 0400 0000 0c00 0000 4dc9 1000 8001 0000 0x0050: 0000 03:07:59.347265 IP (tos 0x0, ttl 128, id 52799, offset 0, flags [none], proto UDP (17), length 140) 192.168.200.185.1968921802 > 192.168.200.137.2049: 112 remove [|nfs] 0x0000: 4500 008c ce3f 0000 8011 598d c0a8 c8b9 0x0010: c0a8 c889 03f8 0801 0078 5896 755b 5cca 0x0020: 0000 0000 0000 0002 0001 86a3 0000 0003 0x0030: 0000 000c 0000 0001 0000 0020 0000 0000 0x0040: 0000 0000 0000 03e9 0000 03e9 0000 0003 0x0050: 0000 03:07:59.348654 IP (tos 0x0, ttl 128, id 55143, offset 0, flags [none], proto UDP (17), length 172) 192.168.200.137.2049 > 192.168.200.185.1968921802: reply ok 144 remove PRE: [|nfs] 0x0000: 4500 00ac d767 0000 8011 5045 c0a8 c889 0x0010: c0a8 c8b9 0801 03f8 0098 1757 755b 5cca 0x0020: 0000 0001 0000 0000 0000 0000 0000 0000 0x0030: 0000 0000 0000 0000 0000 0001 0000 0000 0x0040: 0000 4000 4099 2bf4 0000 0000 4099 2bf4 0x0050: 0000 wbr, pluknet