From: Charles Sprickman
Date: Fri, 18 Nov 2005 22:05:42 -0500 (EST)
To: freebsd-hackers@freebsd.org
Subject: 4.8 "Alternate system clock has died" error

Hello,

I tried this query on -stable, hoping someone here can help me further understand and troubleshoot this.

Reference: http://thread.gmane.org/gmane.os.freebsd.stable/32837

In short, top, ps report 0% CPU on all processes as of a few weeks ago. "systat -vmstat" hands out the "Alternate system clock has died" error. Box is running 4.8-p24 and has been up 425 days.
Nothing out of the ordinary except for the above symptoms.

In searching the various lists/newsgroups, it seems that other folks with this problem have fixed it, or tried to, in various ways:

-early 4.x users referenced a PR that was committed before 4.8
-some 5.3 users reported this with unknown resolution/cause
-sending init a HUP was suggested (tried it, no luck)
-setting kern.timecounter.method: 1 (tried it, no luck)
-one user seemed to actually have a dead timer

In the -stable thread one person answered, and theirs was the sole example I could find of a true hardware failure. I think the odds favor a software issue...

The hardware I was given is probably a bit uncommon; it's an SMP Athlon box (Tyan S2462 THUNDER K7), probably not the most widely-tested platform.

The -stable poster warned that if the RTC is bad, the machine likely won't come back up if I reboot it. That has me very worried, as this box is very important (mail server).

Can anyone help me determine if this is a hardware problem? If it is, I really need to stretch the budget and dig up some new hardware to transplant everything into.

Dmesg is in the linked thread. If there's any other info I can provide, let me know.

Thanks,

Charles