Date: Wed, 24 May 2006 18:58:06 +0200 (MEST)
From: "Joerg Lehners"
To: freebsd-stable@freebsd.org
Subject: Re: Trouble with NFSd under 6.1-Stable, any ideas?

"Rong-en Fan" wrote:
> On 5/14/06, Kris Kennaway wrote:
>> On Sun, May 14, 2006 at 02:28:55PM -0400, Howard Leadmon wrote:
>>> [...]
>> Use tcpdump and related tools to find out what traffic is being sent.
>>
>> Also verify that you did not change your system configuration in any
>> way: there have been no changes to NFS since the release, so it is
>> unclear why an update would cause the problem to suddenly occur.
>>
>> Kris
>
> Hi Kris and Howard,
>
> As I posted few days ago, I have similar problems like Howard's
> (some details in the thread "6.1-RELEASE, em0 high interrupt rate
> and nfsd eats lots of cpu" on stable@). After binary searching
> the source tree, I found that
>
> RELENG_6_1, 2006.04.30.03.57 ok
> RELENG_6_1, 2006.04.30.04.00 bad
>
> The only commit is kern/vfs_lookup.c, an MFC of rev 1.90 and 1.91.
> With 04.30 03.57's source + manually patched vfs_lookup.c rev 1.90,
> the same problem occurs.
[...]

Confirmed! I can create the problem here at will.

Setup 1:
NFS server 'testido' FreeBSD 6.1-STABLE as of 15. May 2006 with
sys/kern/vfs_lookup.c 1.80.2.7,
NFS client 'schurks' FreeBSD 6.1-STABLE as of 15. May 2006.
/usr/src from testido mounted on /mnt on schurks.
Running 'cd /mnt ; du >/dev/null' two times (first after a fresh boot
of testido, second when all served data is in memory of testido):

joerg @ schurks> cd /mnt
joerg @ schurks> time du >/dev/null
   86.09s real     0.14s user     1.91s system
joerg @ schurks> time du >/dev/null
  205.10s real     0.20s user     1.92s system
joerg @ schurks>

Screenful of top output on testido AFTER both tests (testido sometimes
stopped responding to screen output, especially during the second test):

last pid:   329;  load averages:  4.14,  2.77,  1.25    up 0+00:07:30  18:44:47
29 processes:  1 running, 28 sleeping
CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 8420K Active, 28M Inact, 72M Wired, 110M Buf, 880M Free
Swap: 4000M Total, 4000M Free

  PID USERNAME THR PRI NICE   SIZE    RES STATE   TIME    WCPU COMMAND
  201 root       1   4    0  1232K   792K -       4:42 116.31% nfsd
  329 joerg      1  96    0  2404K  1676K RUN     0:00   0.00% top
  168 root       1 115    0  2456K  1760K select  0:00   0.00% sshd
  313 root       1  96    0  1428K  1168K select  0:00   0.00% rlogind
  194 root       1 115    0  1556K  1256K select  0:00   0.00% mountd
  299 root       1   8    0  1720K  1436K wait    0:00   0.00% login
  314 root       1   8    0  1748K  1460K wait    0:00   0.00% login
  298 root       1  96    0  1304K  1048K select  0:00   0.00% rlogind
  199 root       1   4    0  1356K  1040K accept  0:00   0.00% nfsd
  256 root       1  96    0  2892K  1760K select  0:00   0.00% ntpd
  315 joerg      1  20    0  1448K  1020K pause   0:00   0.00% ksh
  300 root       1   5    0  1448K   996K ttyin   0:00   0.00% ksh
  158 root       1  96    0  1332K   940K select  0:00   0.00% syslogd
  163 root       1  96    0  1448K  1128K select  0:00   0.00% inetd
  176 root       1  96    0  1408K  1044K select  0:00   0.00% rpcbind
  185 root       1  96    0  1476K  1148K select  0:00   0.00% ypbind
  261 root       1 115    0  1304K   952K select  0:00   0.00% lpd

Setup 2:
NFS server 'testido' FreeBSD 6.1-STABLE as of 15. May 2006 with
sys/kern/vfs_lookup.c 1.80.2.6,
NFS client 'schurks' FreeBSD 6.1-STABLE as of 15. May 2006.
Same tests as before:

joerg @ schurks> time du >/dev/null
   22.63s real     0.15s user     1.82s system
joerg @ schurks> time du >/dev/null
   16.52s real     0.17s user     1.68s system
joerg @ schurks>

Screenful of top output on testido AFTER both tests (testido responded
fine during both tests):

last pid:   329;  load averages:  0.49,  0.26,  0.10    up 0+00:01:50  18:35:30
29 processes:  1 running, 28 sleeping
CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 8424K Active, 28M Inact, 72M Wired, 110M Buf, 880M Free
Swap: 4000M Total, 4000M Free

  PID USERNAME THR PRI NICE   SIZE    RES STATE   TIME    WCPU COMMAND
  201 root       1   4    0  1232K   792K -       0:03   3.76% nfsd
  168 root       1 115    0  2456K  1760K select  0:00   0.00% sshd
  329 joerg      1  96    0  2404K  1676K RUN     0:00   0.00% top
  313 root       1  96    0  1428K  1168K select  0:00   0.00% rlogind
  194 root       1 115    0  1556K  1256K select  0:00   0.00% mountd
  299 root       1   8    0  1720K  1440K wait    0:00   0.00% login
  314 root       1   8    0  1748K  1464K wait    0:00   0.00% login
  298 root       1  96    0  1304K  1048K select  0:00   0.00% rlogind
  199 root       1   4    0  1356K  1040K accept  0:00   0.00% nfsd
  315 joerg      1  20    0  1448K  1020K pause   0:00   0.00% ksh
  256 root       1  96    0  2892K  1760K select  0:00   0.00% ntpd
  300 root       1   5    0  1448K   996K ttyin   0:00   0.00% ksh
  158 root       1  96    0  1332K   940K select  0:00   0.00% syslogd
  163 root       1  96    0  1448K  1128K select  0:00   0.00% inetd
  261 root       1 109    0  1304K   952K select  0:00   0.00% lpd
  176 root       1  96    0  1408K  1044K select  0:00   0.00% rpcbind
  185 root       1  96    0  1476K  1148K select  0:00   0.00% ypbind

See the HUGE difference in consumed TIME.
The only difference was sys/kern/vfs_lookup.c version 1.80.2.6 vs. 1.80.2.7.

   Joerg

-- 
Mail: Joerg.Lehners@Informatik.Uni-Oldenburg.DE   Tel: 2198
Real: Joerg Lehners, Informatik ARBI, Uni Oldenburg, D-26111 Oldenburg
Ugly words: cost cutting - profit maximization - cheap, cheap, cheap
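
For anyone who wants to put a number on the difference, the slowdown
implied by the figures in this message can be worked out with a short
script. This is only a sketch: the helper names are made up for
illustration and are not part of any FreeBSD tool, the inputs are copied
by hand from the `time` and top output above, and it assumes the top TIME
column is minutes:seconds.

```python
# Illustrative sketch only. All function names here are hypothetical; the
# input strings are copied from the measurements quoted in this message.

def parse_real_seconds(time_line: str) -> float:
    """Return wall-clock seconds from a ksh `time` line such as
    '86.09s real 0.14s user 1.91s system'."""
    fields = time_line.split()
    # The token immediately before the word 'real' holds the elapsed time.
    value = fields[fields.index("real") - 1]
    return float(value.rstrip("s"))


def parse_top_time(mmss: str) -> int:
    """Convert a top TIME value such as '4:42' to seconds
    (assumes the minutes:seconds format)."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def slowdown(bad: float, good: float) -> float:
    """Ratio of the slow measurement to the fast one."""
    return bad / good


if __name__ == "__main__":
    # vfs_lookup.c 1.80.2.7 (slow) vs. 1.80.2.6 (fast), first and second du run.
    runs = {
        "first du run": ("86.09s real 0.14s user 1.91s system",
                         "22.63s real 0.15s user 1.82s system"),
        "second du run": ("205.10s real 0.20s user 1.92s system",
                          "16.52s real 0.17s user 1.68s system"),
    }
    for label, (bad_line, good_line) in runs.items():
        ratio = slowdown(parse_real_seconds(bad_line),
                         parse_real_seconds(good_line))
        print(f"{label}: {ratio:.1f}x slower")

    # Accumulated nfsd CPU time on the server after both runs.
    cpu_ratio = slowdown(parse_top_time("4:42"), parse_top_time("0:03"))
    print(f"nfsd CPU time: {cpu_ratio:.0f}x more")
```

Note that the two top snapshots were taken at different uptimes, so the
CPU-time ratio is only a rough indication.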