Date: Thu, 1 Dec 2005 14:38:13 -0500
From: "N.J. Thomas"
To: freebsd-questions@freebsd.org
Subject: overloaded webserver: nfs wait issue?

We have a website with moderately high traffic, load balanced across
three web servers. During peak traffic, however (when the volume is
higher than normal), the load average shoots up to over 100 and the
site slows to a crawl.

We set up a script to take snapshots of top every 20 seconds. Here is
what it looks like when everything is normal:

127
last pid: 12003;  load averages:  0.93,  1.36,  1.35    up 41+04:22:14  14:00:23
243 processes: 12 running, 230 sleeping, 1 zombie
Mem: 222M Active, 74M Inact, 186M Wired, 16M Cache, 111M Buf, 503M Free
Swap: 2048M Total, 16M Used, 2032M Free

  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
  136 root      32    0 1208K   420K RUN     33.1H  7.28%  7.28% amd
11918 nobody    -1    0  149M 12292K nfsrcv   0:01  3.00%  1.95% httpd
11879 nobody     2    0  149M 12292K sbwait   0:01  2.10%  1.37% httpd
11896 nobody     2    0  148M 11704K RUN      0:00  1.80%  1.17% httpd
11962 nobody     2    0  147M 10072K RUN      0:00  4.33%  1.12% httpd
11892 nobody    -1    0  145M  8804K nfsrcv   0:00  1.35%  0.88% httpd
11935 nobody     2    0  149M 12284K sbwait   0:00  1.73%  0.78% httpd
11925 nobody     2    0  149M 12288K sbwait   0:00  1.08%  0.68% httpd
11894 nobody     2    0  149M 12404K sbwait   0:00  0.98%  0.63% httpd
11937 nobody     2    0  149M 12456K RUN      0:00  1.61%  0.63% httpd
11954 nobody     2    0  149M 12288K sbwait   0:00  1.88%  0.49% httpd
  191 root       2    0  144M  6632K select  13:23  0.34%  0.34% httpd
11930 nobody     2    0  145M  8852K sbwait   0:00  0.62%  0.34% httpd
11872 nobody     2    0  149M 12288K sbwait   0:00  0.45%  0.29% httpd
11911 nobody     2    0  148M 11604K accept   0:00  0.45%  0.29% httpd
11893 nobody     2    0  149M 12392K sbwait   0:00  0.38%  0.24% httpd
11876 nobody     2    0  149M 12264K sbwait   0:00  0.38%  0.24% httpd
11934 nobody     2    0  149M 12292K accept   0:00  0.41%  0.20% httpd

When the load shoots up, the number of http clients hits Apache's
MaxClients setting. Here is what top shows:

last pid: 12407;  load averages: 87.84, 51.91, 27.52    up 41+04:40:51  14:19:00
268 processes: 2 running, 266 sleeping
Mem: 715M Active, 68M Inact, 187M Wired, 29M Cache, 111M Buf, 2100K Free
Swap: 2048M Total, 272M Used, 1776M Free, 13% Inuse

  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
  136 root      64    0 1208K   376K RUN     33.1H  2.69%  2.69% amd
11965 nobody    -1    0  149M  6892K nfsrcv   0:05  0.24%  0.24% httpd
11913 nobody    -1    0  149M  8300K nfsrcv   0:05  0.20%  0.20% httpd
11878 nobody    -1    0  149M  8572K nfsrcv   0:09  0.15%  0.15% httpd
11948 nobody    -1    0  149M  8852K nfsrcv   0:07  0.15%  0.15% httpd
11982 nobody    -1    0  149M  6764K nfsrcv   0:04  0.15%  0.15% httpd
11912 nobody    -1    0  149M  4912K nfsrcv   0:06  0.10%  0.10% httpd
12060 nobody    -1    0  149M  7356K nfsrcv   0:05  0.10%  0.10% httpd
11999 nobody    -1    0  149M  8352K nfsrcv   0:04  0.10%  0.10% httpd
12122 nobody    -1    0  149M  8296K nfsrcv   0:04  0.10%  0.10% httpd
12028 nobody    -1    0  149M  8664K nfsrcv   0:04  0.10%  0.10% httpd
12267 nobody    -1    0  149M  8452K nfsrcv   0:03  0.10%  0.10% httpd
12270 nobody    -1    0  150M  7156K nfsrcv   0:02  0.10%  0.10% httpd
11983 nobody    -1    0  149M  8256K nfsrcv   0:09  0.05%  0.05% httpd
11977 nobody    -1    0  149M  5488K nfsrcv   0:06  0.05%  0.05% httpd
11952 nobody    -1    0  149M  6704K nfsrcv   0:06  0.05%  0.05% httpd
11895 nobody    -1    0  148M  4404K nfsrcv   0:06  0.05%  0.05% httpd
11885 nobody    -1    0  149M  8348K nfsrcv   0:06  0.05%  0.05% httpd

All of the httpd processes are in the "nfsrcv" state. Does this mean
the bottleneck is at the NFS server that hosts the htdocs (and PHP
scripts), or just that the web server is low on memory?

Thomas

-- 
N.J. Thomas
njt@ayvali.org
Etiamsi occiderit me, in ipso sperabo
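
To put a number on how many httpd processes sit in each state, the saved
top snapshots could be tallied by their STATE column. The following is a
minimal sketch in Python, assuming the 11-column process layout shown in
the snapshots above (PID through COMMAND, with STATE as the seventh
column). The function name count_states and the embedded sample are
illustrative only and are not taken from the snapshot script mentioned in
the message.

```python
from collections import Counter


def count_states(snapshot, command="httpd"):
    """Tally the STATE column for rows of one command in a top snapshot.

    Assumes each process row has exactly 11 whitespace-separated columns:
    PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
    (a top build with an extra column would need different indexes).
    """
    states = Counter()
    for line in snapshot.splitlines():
        fields = line.split()
        # Skip the summary lines and the column header: process rows
        # have 11 fields and begin with a numeric PID.
        if len(fields) != 11 or not fields[0].isdigit():
            continue
        if fields[10] == command:
            states[fields[6]] += 1
    return states


# A few rows copied from the overloaded snapshot, used as sample input.
sample = """\
268 processes: 2 running, 266 sleeping
  PID USERNAME PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
  136 root      64    0 1208K   376K RUN     33.1H  2.69%  2.69% amd
11965 nobody    -1    0  149M  6892K nfsrcv   0:05  0.24%  0.24% httpd
11913 nobody    -1    0  149M  8300K nfsrcv   0:05  0.20%  0.20% httpd
"""

if __name__ == "__main__":
    for state, count in count_states(sample).most_common():
        print(state, count)
```

Comparing these counts between a normal and an overloaded snapshot would
show how far the mix shifts from sbwait/accept toward nfsrcv. It does not
by itself distinguish a slow NFS server from memory pressure on the web
server.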