From: Rick Macklem
To: Clinton Adams
Cc: FreeBSD FS
Date: Fri, 22 Jul 2011 18:16:49 -0400 (EDT)
Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel

Clinton Adams wrote:
[stuff snipped for brevity]
>
> Running four clients now and the LockOwners are steadily climbing,
> nfsstat consistently reported it as 0 prior to users logging into the
> nfsv4 test systems - my testing via ssh didn't show anything like
> this. Attached tcpdump file is from when I first noticed the jump in
> LockOwners from 0 to ~600. I tried wireshark on this and didn't see
> any releaselockowner operations.
>
[stuff snipped for brevity]
> OpenOwner     Opens LockOwner     Locks    Delegs
>         6       242      2481        22         0
> Server Cache Stats:
>    Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
>         0         0         2   2518251      2502      4772
>
I've written a small test program:
  http://people.freebsd.org/~rmacklem/childlock.c
(also attached) where a parent process opens a file and then forks
children that do lock ops and then exit. (I'm guessing that this is
what some processes in your clients are doing, which results in the
LockOwner count growing.)

When I run this program on Fedora15, it generates ReleaseLockOwner Ops
and the LockOwner count doesn't increase as it runs.

You can run this program by giving it an argument that can be any file
on the nfsv4 mount for which you have read/write access, then watch the
server via "nfsstat -e -s" to see if the LockOwner count increases.

If the LockOwner count does increase, then it appears that a newer
Linux kernel will avoid the problem.

If you are interested in what the packet trace looks like when running
the program on Fedora15, it's at:
  http://people.freebsd.org/~rmacklem/childlock.pcap

rick
ps: The FreeBSD NFSv4 client doesn't currently generate the
    ReleaseLockOwner Ops for this case either. I need to come up
    with a patch that does that.
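
For readers who cannot fetch the test program, the pattern described above (a parent opens a file, then forks children that do lock ops and exit) can be sketched roughly as follows. This is not the childlock.c referenced above. The function names run_lock_children() and lock_and_unlock() are hypothetical, and the use of fcntl() byte-range locks on the first byte of the file is an assumption about what "lock ops" means here. Since POSIX record locks are owned per process, each child would presumably appear to an NFSv4 server as a separate lock owner.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Child body (hypothetical helper): take a write lock on the first
 * byte of the file, then release it. Returns 0 on success, -1 on error.
 */
static int lock_and_unlock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 1;
    if (fcntl(fd, F_SETLKW, &fl) == -1)   /* block until the lock is granted */
        return -1;

    fl.l_type = F_UNLCK;
    return fcntl(fd, F_SETLK, &fl);       /* drop the lock again */
}

/*
 * Parent body (hypothetical helper): open the file once, then fork
 * nchildren children one at a time. Each child locks and unlocks
 * through the inherited descriptor and exits. Returns the number of
 * children that exited with status 0, or -1 if the open fails.
 */
int run_lock_children(const char *path, int nchildren)
{
    int fd, i, status, ok = 0;
    pid_t pid;

    fd = open(path, O_RDWR);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    for (i = 0; i < nchildren; i++) {
        pid = fork();
        if (pid == -1) {
            perror("fork");
            break;
        }
        if (pid == 0)                     /* child: do the lock ops, then exit */
            _exit(lock_and_unlock(fd) == 0 ? 0 : 1);

        /* parent: wait for this child before forking the next one */
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }

    close(fd);
    return ok;
}
```

To try it, call run_lock_children() from a main() with the path of a read/write file on the nfsv4 mount and a child count of your choosing, and watch the LockOwner column in "nfsstat -e -s" on the server while it runs.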