From: Rick Macklem
To: alan bryan
Cc: freebsd-stable@freebsd.org
Date: Tue, 10 Aug 2010 22:35:17 -0400 (EDT)
Subject: Re: FreeBSD 8.1-Release NFSD hang in rc_lo state
Message-ID: <1347461186.514707.1281494117995.JavaMail.root@erie.cs.uoguelph.ca>
In-Reply-To: <881208.18587.qm@web50506.mail.re2.yahoo.com>
List-Id: Production branch of FreeBSD source code

> I'm doing some testing with loading up a new storage server. I have
> couple ZFS filesystems exported via NFS over UDP.
>
> I have a client machine (8.1 also) that has mounted those filesystems
> along with some test PHP scripts that are doing a ton of
> read/write/fstat operations to load it up.
>
> If I ctrl-C to kill the scripts on the client I found that I can end
> up with nfsd on the server stuck at 100% CPU and in the rc_lo state.
> /etc/rc.d/nfsd restart does nothing.
>
> # nfsstat -s -w 1 -W
>  GtAttr Lookup Rdlink   Read  Write Rename Access  Rddir
>       0      0      0      0      0      0      0      0
>       0      0      0      0      0      0      0      0
>       0      0      0      0      0      0      0      0
>
>
> Client complaining with:
> kernel: nfs server 192.168.1.2:/tank/alantest2: not responding
>
>
>
> Top on the server:
>
> last pid: 4776; load averages: 1.24, 1.15, 1.15 up 0+23:04:46 18:25:26
> 53 processes: 2 running, 35 sleeping, 16 lock
> CPU: 0.0% user, 0.0% nice, 25.0% system, 0.0% interrupt, 75.0% idle
> Mem: 21M Active, 5812K Inact, 2029M Wired, 88K Cache, 8592K Buf, 5857M Free
> Swap: 3942M Total, 3942M Free
>
>   PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
>   922 root      44    0  5804K  1820K CPU2    2  63:52 100.00% {nfsd: service}
>   922 root      45    0  5804K  1820K *rc_lo  1  30:23   0.00% {nfsd: service}
>   922 root      48    0  5804K  1820K rpcsvc  1  30:20   0.00% {nfsd: service}
>   922 root      46    0  5804K  1820K *rc_lo  1  30:17   0.00% {nfsd: service}
>   922 root      48    0  5804K  1820K rpcsvc  1  30:17   0.00% {nfsd: service}
>   922 root      46    0  5804K  1820K rpcsvc  1  30:13   0.00% {nfsd: service}
>   922 root      45    0  5804K  1820K *rc_lo  2  30:13   0.00% {nfsd: service}
>   922 root      45    0  5804K  1820K *rc_lo  3  30:12   0.00% {nfsd: service}
>   922 root      44    0  5804K  1820K *rc_lo  0  30:10   0.00% {nfsd: service}
>   922 root      46    0  5804K  1820K *rc_lo  2  30:07   0.00% {nfsd: service}
>   922 root      46    0  5804K  1820K *rc_lo  0  30:03   0.00% {nfsd: service}
>   922 root      45    0  5804K  1820K *rc_lo  1  30:03   0.00% {nfsd: service}
>   922 root      46    0  5804K  1820K *rc_lo  2  30:03   0.00% {nfsd: service}
>   922 root      45    0  5804K  1820K *rc_lo  2  30:02   0.00% {nfsd: service}
>   922 root      45    0  5804K  1820K *rc_lo  1  29:59   0.00% {nfsd: master}
>   922 root      45    0  5804K  1820K *rc_lo  2  29:54   0.00% {nfsd: service}
>   922 root      46    0  5804K  1820K *rc_lo  0  29:54   0.00% {nfsd: service}
>   922 root      46    0  5804K  1820K *rc_lo  0  29:52   0.00% {nfsd: service}
>   922 root      47    0  5804K  1820K *rc_lo  2  29:46   0.00% {nfsd: service}
>   922 root      46    0  5804K  1820K *rc_lo  1  29:44   0.00% {nfsd: service}
>

I have a patch that I think might fix this. (Essentially the same bug
was posted on another list recently.) I'll post it to Alan separately,
but if anyone else wants to try it, it will be at
http://people.freebsd.org/~rmacklem/freebsd8.1-patches/replay.patch
in a few minutes.

rick