Date: Mon, 26 Aug 2002 10:16:00 +0100
From: Matthew Seaman
To: Robert Covell
Cc: freebsd-questions@FreeBSD.ORG
Subject: Re: Incorrect Uptime

On Sun, Aug 25, 2002 at 05:22:38PM -0500, Robert Covell wrote:

> We have a mail server running FreeBSD 4.1.1-RELEASE.  Every so
> often, say once a month, uptime says the server is running at 100%
> for 1, 5, and 15 displays.  But when I go into top the system is
> 100% idle (or very close to it).  The only way I have found to fix
> it is to reboot the box.  I have found out that if uptime returns
> 1.00 for the one minute it means the cpu is at 0% utilization.
> If it says 1.01 it is at 1% utilization.  Anyone have an idea of why
> this would be happening?  We use uptime to monitor the performance
> on the server, and cannot determine why this would be happening when
> it is really not at 100%.

The 1, 5 and 15 minute load averages aren't quite the same thing as
CPU utilization.  The load averages are a measure of the number of
processes sitting in the queue requesting a time slice on the CPU.
On an unloaded system, where there are plenty of spare CPU cycles, a
process will get a time slice almost immediately, so the load average
won't be affected much.

Now, the CPU utilization and the load averages usually correlate
pretty well, but it is possible for the load average to increase
without the CPU usage going up.  This indicates that the kernel is so
busy dealing with some other matter that it hasn't got round to
dealing out time slices to processes very promptly.  Usually that
means some higher priority interrupt triggered by hardware.

This can be an indication of failing hardware: the kernel is
desperately trying to get a response out of a piece of equipment that
has gone a bit catatonic.  It can be down to something as trivial as
a broken wire in a network cable, or as bad as an impending failure
of your main hard drive.

Make sure your backups are comprehensive and up to date.  Survey the
system log files for other evidence of problems; the kernel will
usually log something when it encounters them.  Run healthd or xmbmon
or the like to monitor motherboard and CPU temperatures, since
overheating is one of the most common causes of things going horribly
wrong.  Schedule some down time to perform preventive maintenance
like cleaning dust out of the fans and so forth.

	Cheers,

	Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.                       26 The Paddocks
                                                      Savill Way
                                                      Marlow
Tel: +44 1628 476614                                  Bucks., SL7 1TH UK
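
As an illustration of the distinction drawn above between load average
and CPU utilization, here is a minimal sketch.  It assumes the load
average is an exponentially damped moving average of the number of
queued processes, sampled at a fixed interval.  The 5 second interval
and the names update_load and simulate are assumptions made for this
example; they are not taken from the FreeBSD source.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (assumed for illustration)


def update_load(load, queued, window_seconds, interval=SAMPLE_INTERVAL):
    """Fold one sample of the queue length into a damped moving average."""
    decay = math.exp(-interval / window_seconds)
    return load * decay + queued * (1.0 - decay)


def simulate(queue_samples, window_seconds=60.0):
    """Return the load average after feeding in a series of queue lengths."""
    load = 0.0
    for queued in queue_samples:
        load = update_load(load, queued, window_seconds)
    return load


# A single process that stays in the queue for 30 minutes (360 samples)
# pulls the 1 minute figure towards 1.00, whether or not it ever burns
# any CPU time.
stuck_for_half_an_hour = simulate([1] * 360)
print(f"{stuck_for_half_an_hour:.2f}")
```

Under these assumptions, a steady reading of 1.00 would correspond to
one process permanently waiting in the queue rather than to any
percentage of CPU use, which is why top can show the machine idle at
the same time.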