From: Chris Pratt
Date: Wed, 21 May 2008 08:05:51 -0700
To: Alan Gilmour
Cc: freebsd-questions@freebsd.org
Subject: Re: Server crashing, no explanations

On May 20, 2008, at 7:17 AM, Alan Gilmour wrote:

> Hey all,
>
> We have recently been getting a lot of traffic to one of our sites.
> The CPU is consistently during busy periods using 100% utilisation.
> When this happens we have approx 150 apache threads, and the loads
> goes way above 15.
>
> However recently the server has been auto-restarting (when under heavy
> load) with no explanation in any logs. I've checked the console log,
> messages, db logs e.t.c. but no mention of anything wrong.
>
> Brief server summary :
>
> FreeBSD 6.3-STABLE #0:
> CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2800.11-MHz 686-class CPU)
> Logical CPUs per core: 2
> real memory = 17716740096 (16896 MB)
> avail memory = 16837763072 (16057 MB)
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>
> We tried installing mbmon and lmmon and healthd, but none seem to
> work.
>
> Anyone got any suggestions for other things we can try to detect why
> the server is failing? or other ways to check things like CPU temp and
> memory status?

We have experienced this since 6.x began, and it's not hardware. It can
be reproduced by moving the role to another similar server: when the
role and its traffic (not necessarily the load) are moved, the problem
goes away, or rather, transfers to the new box.

Look at the thread named "zonealarm issues" on freebsd-net from a couple
of months ago. You may find it applies, but there aren't any answers
there yet. I gather that more data needs to be collected.
I have never figured out how to get a dump, though people have
recommended things to try over the last couple of years. I was hoping
7.0 would be the solution, but I'm told it's not.

Reduce your traffic and the problem will go away. Splitting the traffic
across more than one server is one way to do this. We increased our
uptime drastically by doing so, but we still get hit hard enough at
times to go down. During our low-traffic periods of the year, we simply
stay up all the time (in the hottest days of summer).

By the way, the symptom I see is never an immediate reboot; the box
hangs for a reasonable period of time before rebooting. As I monitor
ours 24/7, I reset power on the box before it reboots to reduce the
outage for customers. If I'm not watching, it eventually reboots on its
own. Brutal, but it works.

Realize it's possible you don't have this problem, but there are a few
of us who do. It has something to do with buffers not being freed up.

> Cheers
>
> Alan