Date: Thu, 09 Dec 2010 12:31:04 +0100
From: Laszlo Nagy <gandalf@shopzeus.com>
To: questions@freebsd.org
Cc: danieleff@gmail.com
Subject: What is loading my server so much?
Message-ID: <4D00BDF8.6020206@shopzeus.com>
System is:

FreeBSD shopzeus.com 8.1-STABLE FreeBSD 8.1-STABLE #0: Sun Oct 31 02:55:28 EDT 2010 amd64

It has two quad-core Xeon CPUs, 24GB memory, and a RAID 1+0 array with 10 disks + Areca 1680 controller with 2GB write-back cache.

Server is running: mailscanner + apache multihost + PHP + postgresql. Main load on the server is usually postgresql.

Today something happened. The number of http processes went up to 200. As a result, the number of connections to the database also went up to 200, and the web server is now refusing clients with "Cannot connect to database" messages (coming from PHP).

This is a typical output from top:

last pid: 12789;  load averages:  7.77, 10.77, 13.46    up 26+03:00:30  06:22:04
6637 processes: 7 running, 623 sleeping, 7 zombie
CPU: 32.9% user,  0.0% nice,  7.6% system,  0.6% interrupt, 58.9% idle
Mem: 3885M Active, 15G Inact, 3236M Wired, 627M Cache, 2465M Buf, 656M Free
Swap: 12G Total, 12M Used, 12G Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
66834 pgsql       1 118    0   443M   417M CPU2    2  16:17 99.46% postgres
11473 pgsql       1  72    0   441M   242M sbwait  5   0:02  4.59% postgres
11026 pgsql       1  47    0   439M   249M sbwait  7   0:01  3.17% postgres
 6642 www         1  48    0   236M 42928K select  0   0:01  2.29% httpd
10147 www         1  48    0   236M 44048K select  6   0:01  2.10% httpd
 3961 shopzeus   29  44    0   208M 96364K uwait   4  18.4H  1.37% python

Here is what I don't understand:

- "last pid" is increasing relatively slowly, i.e. there are no hidden short-lived processes.
- Only the first one or two processes are showing CPU load > 10%.
- The "CPU User%" value is about 50%.
- We have lots of free memory.
- I/O load is almost nothing (see iostat below).

However, server load is between 7 and 13! In fact, sometimes it is above 16. And everybody complains that the server is too slow.

How can I find out what is causing the problem?
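To put the load averages from the top header in perspective, they can be divided by the number of CPUs. The following is only a rough sketch of that arithmetic: the figure of 8 CPUs is an assumption (two quad-core Xeons without hyper-threading), and the names `load_averages`, `NUM_CPUS` and `load_per_cpu` are made up for illustration.

```python
# Load averages copied from the top header above (1, 5 and 15 minute windows).
load_averages = {"1 min": 7.77, "5 min": 10.77, "15 min": 13.46}

# Assumption: two quad-core CPUs give 8 logical CPUs (no hyper-threading).
NUM_CPUS = 2 * 4


def load_per_cpu(load, num_cpus=NUM_CPUS):
    """Return the load average divided by the number of CPUs."""
    return load / num_cpus


# Load per CPU for each window, rounded to two decimals for display.
per_cpu = {
    window: round(load_per_cpu(value), 2)
    for window, value in load_averages.items()
}

for window, value in per_cpu.items():
    print(f"{window}: {value:.2f} per CPU")
```

A value near or above 1.0 per CPU would mean that, on average, at least as many threads were runnable as there are CPUs, which is hard to reconcile with the 58.9% idle figure in the same top output.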
Example gstat output:

dT: 1.006s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0| ad4
    0      0      0      0    0.0      0      0    0.0    0.0| ad4s1
    0      0      0      0    0.0      0      0    0.0    0.0| ad4s1d
    0      0      0      0    0.0      0      0    0.0    0.0| da0
    0      0      0      0    0.0      0      0    0.0    0.0| da0s1
    1    304      3     34   14.0    301   7522    0.2    5.1| da1
    0      2      2     32   11.9      0      0    0.0    2.4| da2
    0      0      0      0    0.0      0      0    0.0    0.0| da3
    0      0      0      0    0.0      0      0    0.0    0.0| da4
    0      0      0      0    0.0      0      0    0.0    0.0| da0s1a
    0      0      0      0    0.0      0      0    0.0    0.0| da0s1b
    0      0      0      0    0.0      0      0    0.0    0.0| da0s1d
    0      0      0      0    0.0      0      0    0.0    0.0| da0s1e
    1    304      3     34   14.0    301   7522    0.3    5.3| da1s1
    0      2      2     32   11.9      0      0    0.0    2.4| da2s1
    0      0      0      0    0.0      0      0    0.0    0.0| da3s1
    0      0      0      0    0.0      0      0    0.0    0.0| da4s1
    1    304      3     34   14.0    301   7522    0.4    5.4| da1s1d
    0      2      2     32   11.9      0      0    0.0    2.4| da2s1d
    0      0      0      0    0.0      0      0    0.0    0.0| da3s1d

Example iostat output:

       tty             ad4               da0               da1             cpu
 tin   tout   KB/t tps  MB/s    KB/t tps  MB/s    KB/t tps  MB/s  us ni sy in id
   0    349  30.81  16  0.49   16.51  11  0.18   22.56 124  2.72  29  0  9  1 61
   0   9282   0.00   0  0.00    0.00   0  0.00   16.00   7  0.11  41  0 11  1 47
   0  12520   0.00   0  0.00    0.00   0  0.00   18.00   8  0.14  45  0 14  0 41
   0  12205   0.00   0  0.00    0.00   0  0.00    0.00   0  0.00  38  0 15  0 47

Example systat output:

                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

                    /0%  /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
pgsql      postgres XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root           idle XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root           idle XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
root           idle XXXXXXXXXXXXXXXXXXXXXXXXXX
www           httpd XXXXXXXXXXXXXXXXXXXXXXXXX
root           idle XXXXXXXXXXXXXXXXXXXXXXXXX
root           idle XXXXXXXXXXXXXXXXXXXXXXXX
root           idle XXXXXXXXXXXXXXXXXXXXXXX
root           idle XXXXXXXXXXXXXXXXXXX
root           idle XXXXXXXX
www           httpd XXXXXX
pgsql      postgres XXX
pgsql      postgres X
www           httpd X
root           intr X
www           httpd X
www           httpd X
www           httpd X
www           httpd X
shopzeus     python X
www           httpd X
www           httpd X
www           httpd X
www           httpd X
www           httpd X
www           httpd X
www           httpd X
www           httpd X
zeusd1       python X
www           httpd X
www           httpd X
www           httpd X
www           httpd X
www           httpd X
www           httpd X
www           httpd X

Looks like the server is almost idle.
So how can the load average be 12 or thereabouts?

Thanks,

Laszlo
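For comparison with the load average, the CPU columns of the iostat samples quoted in this message can be reduced to a single "busy" percentage (100 minus the idle column). This is just an illustrative sketch: the sample tuples are copied from the iostat output, the helper name `busy_percent` is hypothetical, and the first iostat line normally reports averages since boot, so it is also shown excluded.

```python
# CPU columns (us, ni, sy, in, id) from the four iostat samples in this message.
samples = [
    (29, 0, 9, 1, 61),   # first line: usually the average since boot
    (41, 0, 11, 1, 47),
    (45, 0, 14, 0, 41),
    (38, 0, 15, 0, 47),
]


def busy_percent(sample):
    """CPU time not spent idle: 100 minus the 'id' column."""
    return 100 - sample[4]


busy = [busy_percent(s) for s in samples]

# Average over all four samples, and over the three live samples only.
average_busy_all = sum(busy) / len(busy)
average_busy_live = sum(busy[1:]) / len(busy[1:])

print("busy % per sample:", busy)
print("average, all samples:", average_busy_all)
print("average, excluding first sample:", average_busy_live)
```

With these figures the CPUs are roughly half busy, in line with the "about 50%" estimate in the message, so CPU utilisation alone does not account for a load average above the CPU count.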