From: Paul Mather
Date: Wed, 29 Jun 2011 11:25:31 -0400
To: Nathan Whitehorn
Cc: freebsd-ppc@freebsd.org
Subject: Re: Xserve G5 keeps shutting down
In-Reply-To: <4E089692.30203@freebsd.org>

On Jun 27, 2011, at 10:41 AM, Nathan Whitehorn wrote:

> On 06/27/11 09:28, Paul Mather wrote:
>> On Jun 25, 2011, at 8:52 PM, Nathan Whitehorn wrote:
>> What is odd, to me, is that this power-off occurred even after
>> commenting out the shutdown line in the thermal power management
>> driver. So, it must be something else that is forcibly powering off
>> the system, maybe something in OpenFirmware, rather like some PC
>> BIOSes will initiate a power-off when temperatures exceed critical.
>> But, what is definitely odd is that temperatures don't seem to get so
>> high as to be critical, so perhaps it is some other hardware state
>> that is triggering the power-off?
>
> Yes, something like that seems likely, either some strange firmware
> thing or the PMU microcontroller becoming unhappy. A few more things
> to try:
> 1) Does setting machdep.manage_fans=0 from the loader change anything?

I am trying that now. Luckily, this is a RW sysctl, because I cannot
seem to set it, either at the boot loader prompt or in
/boot/loader.conf, and have the value stick. It always seems to end up
as machdep.manage_fans=1 (the default) by the time the system is
booted, and I have to change it back to machdep.manage_fans=0 manually.
Is the /boot/loader.conf mechanism broken on FreeBSD/powerpc64?

> 2) Change the pause in powermac_thermal.c line 86 from hz to 2*hz or
> even 3*hz, in case the firmware has a bug similar to the one we used
> to have, and BSD is saturating the I2C sensors so that firmware can't
> read the temperature.

I tried this, changing the pause to 3*hz as suggested, but it appeared
to have no effect: the system still powered off after a while.
> 3) Add RackMac3,1 to the list of systems on which the firmware is
> quiesced at line 517 of /sys/powerpc/ofw/ofw_machdep.c

I've not tried this, yet.

The other thing I have tried since my last posting is to run the Apple
Remote Diagnostics 1.0.4. The Xserve G5 in question passed all 169 of
the extended tests run on it. I have a copy of the test output log, if
interested.

What kind of system temperatures are other Xserve G5 users getting? I
have no idea what is normal and so whether or not I should run the fans
higher. I am currently running the fans at 6000 rpm with
machdep.manage_fans=0, mainly as a good compromise between noise and
cooling. (The system is in my office whilst I set it up and configure
it---I was about to deploy the machine when the latest power-offs
stymied that.)

With the system largely idle, these are the temperatures being reported
with the fans running at ~6000 rpm:

dev.max6690.0.sensor.sys_ctrlr_ambient.temp: 40.0C
dev.max6690.0.sensor.sys_ctrlr_internal.temp: 51.7C
dev.ad7417.0.sensor.cpu_a_ad7417_amb.temp: 33.7C
dev.ad7417.0.sensor.cpu_a_diode_temp.temp: 45.9C
dev.ad7417.1.sensor.cpu_b_ad7417_amb.temp: 31.5C
dev.ad7417.1.sensor.cpu_b_diode_temp.temp: 45.1C

This is what the temperatures look like about 30 minutes into a "make
-j5 buildworld" (uptime: "11:23AM up 1:03, 2 users, load averages:
6.52, 6.52, 5.23"):

dev.max6690.0.sensor.sys_ctrlr_ambient.temp: 44.0C
dev.max6690.0.sensor.sys_ctrlr_internal.temp: 56.3C
dev.ad7417.0.sensor.cpu_a_ad7417_amb.temp: 40.0C
dev.ad7417.0.sensor.cpu_a_diode_temp.temp: 64.9C
dev.ad7417.1.sensor.cpu_b_ad7417_amb.temp: 36.0C
dev.ad7417.1.sensor.cpu_b_diode_temp.temp: 62.6C

Again, that is with the fans running at about 6000 rpm.

Cheers,

Paul.
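
For anyone wanting to compare their own readings against the two sets quoted in this message, the rise under load can be worked out with a short script. This is only a minimal Python sketch, assuming the sensor output is pasted in as text in the "name: value" form shown in the message; the function names parse_temps and temp_rise are made up for illustration and are not part of any FreeBSD tool.

```python
import re

# Readings copied from the message: idle, then ~30 minutes into buildworld.
IDLE = """\
dev.max6690.0.sensor.sys_ctrlr_ambient.temp: 40.0C
dev.max6690.0.sensor.sys_ctrlr_internal.temp: 51.7C
dev.ad7417.0.sensor.cpu_a_ad7417_amb.temp: 33.7C
dev.ad7417.0.sensor.cpu_a_diode_temp.temp: 45.9C
dev.ad7417.1.sensor.cpu_b_ad7417_amb.temp: 31.5C
dev.ad7417.1.sensor.cpu_b_diode_temp.temp: 45.1C
"""

LOAD = """\
dev.max6690.0.sensor.sys_ctrlr_ambient.temp: 44.0C
dev.max6690.0.sensor.sys_ctrlr_internal.temp: 56.3C
dev.ad7417.0.sensor.cpu_a_ad7417_amb.temp: 40.0C
dev.ad7417.0.sensor.cpu_a_diode_temp.temp: 64.9C
dev.ad7417.1.sensor.cpu_b_ad7417_amb.temp: 36.0C
dev.ad7417.1.sensor.cpu_b_diode_temp.temp: 62.6C
"""

# Matches lines such as "dev.ad7417.0.sensor.cpu_a_diode_temp.temp: 45.9C".
LINE_RE = re.compile(r"^(dev\.\S+\.temp):\s*(-?\d+(?:\.\d+)?)C$")


def parse_temps(text):
    """Return a dict mapping sensor name to temperature in degrees C."""
    temps = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            temps[match.group(1)] = float(match.group(2))
    return temps


def temp_rise(idle, load):
    """Return the per-sensor increase (load minus idle), rounded to 0.1 C."""
    return {name: round(load[name] - idle[name], 1)
            for name in idle if name in load}


if __name__ == "__main__":
    rise = temp_rise(parse_temps(IDLE), parse_temps(LOAD))
    # Print the sensors with the largest rise first.
    for name, delta in sorted(rise.items(), key=lambda kv: -kv[1]):
        print(f"{name}: +{delta:.1f}C")
```

On the figures in this message, the two CPU diode sensors rise the most (19.0C and 17.5C), while the ambient and controller sensors rise by roughly 4C to 6C.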