From: Matthew Rezny
Date: Thu, 17 Jan 2013 21:59:12 +0100
Subject: PowerMac G5 spurious sensor readings
Message-ID: <7700.1358456352@hexaneinc.com>
List-Id: Porting FreeBSD to the PowerPC

I have a G5 of the first model (PowerMac7,2) on which I've been using
FreeBSD/ppc64 for over a year. Today, it suddenly rebooted. Not the
first time by any means, but this is the first time I found the
following log message:

Jan 17 17:32:19 powermac kernel: WARNING: Current temperature (MLB MAX6690 AMB:127.8 C) exceeds critical temperature (80.0 C)! Shutting down!

This is the first time I have seen such a message. After reboot, that
sensor shows a temperature near 30C, which seems appropriate. The
reading of 127.8C looks suspiciously like a max value. My only guess is
there was a bad read that resulted in the sensor value going over the
threshold.
That raises a question in my mind as to whether there is any filtering
or sanity checking of the data. Could a single bad read cause the
threshold to be exceeded and trigger shutdown immediately, or would the
excessive value have to be returned from that sensor multiple times for
it to be believed and acted upon?

$ uname -a
FreeBSD powermac 9.1-RC1 FreeBSD 9.1-RC1 #0: Thu Aug 16 00:43:39 UTC 2012 root@anacreon.physics.wisc.edu:/usr/obj/usr/src/sys/GENERIC64 powerpc

The build is a bit old, though I wouldn't expect too much change to the
code in question since then. I will update to 9.1-RELEASE or -STABLE in
the next few days, but as this is a problem that has happened once in
over a year, I wouldn't call it resolved just by a quick failure to
reproduce after updating.

I was already planning to do an update after the box has completed its
current task. I noticed a problem with excessive output causing the
console to hang. A couple of days ago I found the machine apparently
hung, in that the keyboard and mouse were not responsive, but I found it
was still alive on the network and I could ssh in to reboot. The only
clues were no buffer space for dmesg to output anything before reboot,
and a rather full /var/log/messages file which had exhausted the drive.

Under the same workload (and after freeing some drive space), the
problem recurred in a matter of hours, but this time with me watching.
While running ddrescue against a drive with some bad sectors, read
errors flood the console in spurts. When some dozens of read errors are
displayed at once, the console scrolls whole pages by in a fraction of a
second, and then goes dead. Messages that should go to console are not
shown on screen but are in the log. Attempts to switch virtual console
or to reboot are not successful, but ssh access continues to work and
the box is clearly still processing other workloads.
The only signs of life from the console are the messages about flushing
buffers just before completion of the reboot commanded via ssh.
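
To make the filtering question above concrete, here is a minimal sketch
in Python of the kind of check I have in mind. It is not FreeBSD code
and says nothing about what the kernel actually does. The class name,
the function name, and the choice of three consecutive reads are all
hypothetical; only the 80.0 C threshold and the 127.8 C reading come
from the log message.

```python
from dataclasses import dataclass


@dataclass
class CriticalTempFilter:
    """Debounce for a critical-temperature check (illustrative only)."""
    critical_c: float = 80.0   # threshold taken from the log message
    required_reads: int = 3    # hypothetical: consecutive over-limit reads needed
    over_count: int = 0        # current run of over-limit reads

    def update(self, reading_c: float) -> bool:
        """Feed one reading; return True once shutdown should trigger."""
        if reading_c > self.critical_c:
            self.over_count += 1
        else:
            # Any in-range reading resets the run, so one glitch is forgotten.
            self.over_count = 0
        return self.over_count >= self.required_reads


def first_trigger(readings, filt):
    """Return the index of the reading that triggers shutdown, or None."""
    for i, reading in enumerate(readings):
        if filt.update(reading):
            return i
    return None


if __name__ == "__main__":
    # A single spurious 127.8 C read between normal ~30 C reads.
    print(first_trigger([30.1, 127.8, 30.4, 30.2], CriticalTempFilter()))
    # A sustained rise above the threshold.
    print(first_trigger([79.0, 81.0, 82.5, 84.0], CriticalTempFilter()))
```

With required_reads set to 1 the sketch behaves like an unfiltered
check, where a single bad read is enough to shut the machine down. The
trade-off of a larger count is a slower reaction to a genuine overheat,
by the number of extra reads times the polling interval.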