Date:      Wed, 29 Jun 2011 11:25:31 -0400
From:      Paul Mather <paul@gromit.dlib.vt.edu>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        freebsd-ppc@freebsd.org
Subject:   Re: Xserve G5 keeps shutting down
Message-ID:  <FC9CB784-FC95-4A69-869A-B0EB6BF54AEB@gromit.dlib.vt.edu>
In-Reply-To: <4E089692.30203@freebsd.org>
References:  <38D89FC6-13F1-4AEF-AF41-0A377EE49DC4@gromit.dlib.vt.edu> <4DFFDEEE.40200@freebsd.org> <E5EE3F19-79AB-417C-A7EE-0F95CE9DB921@gromit.dlib.vt.edu> <4E02C593.6020405@freebsd.org> <C9D9D81E-92B9-4CF3-8DF7-A131DDC34F50@gromit.dlib.vt.edu> <4E0682D3.9070607@freebsd.org> <F05548CE-2EAE-48EA-BA9E-CCD371EB1790@gromit.dlib.vt.edu> <4E089692.30203@freebsd.org>

On Jun 27, 2011, at 10:41 AM, Nathan Whitehorn wrote:

> On 06/27/11 09:28, Paul Mather wrote:
>> On Jun 25, 2011, at 8:52 PM, Nathan Whitehorn wrote:

>> What is odd, to me, is that this power-off occurred even after
>> commenting out the shutdown line in the thermal power management
>> driver.  So, it must be something else that is forcibly powering off
>> the system, maybe something in OpenFirmware, rather like some PC
>> BIOSes will initiate a power-off when temperatures exceed critical.
>> But, what is definitely odd is that temperatures don't seem to get so
>> high as to be critical, so perhaps it is some other hardware state
>> that is triggering the power-off?
>
> Yes, something like that seems likely, either some strange firmware
> thing or the PMU microcontroller becoming unhappy. A few more things
> to try:
> 1) Does setting machdep.manage_fans=0 from the loader change anything?

I am trying that now.  Luckily, this is a RW sysctl, because I cannot
seem to set it at the boot loader prompt or in /boot/loader.conf and
have the value stick.  It always seems to end up as
machdep.manage_fans=1 (the default) by the time the system is booted,
and I have to change it back to machdep.manage_fans=0 manually.

Is the /boot/loader.conf mechanism broken on FreeBSD/powerpc64?
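
For what it's worth, the manual step could be scripted.  This is only a
minimal sketch in Python, assuming sysctl(8) is on the PATH and the
script runs as root; the helper names run_sysctl and ensure_sysctl are
made up for illustration and are not part of any FreeBSD tool:

```python
import subprocess


def run_sysctl(args):
    """Run sysctl(8) with the given arguments and return its trimmed stdout."""
    result = subprocess.run(
        ["sysctl", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def ensure_sysctl(name, wanted, run=run_sysctl):
    """Set sysctl `name` to `wanted` if it differs; return True if changed.

    `run` is injectable so the logic can be exercised without touching
    the real system (hypothetical design choice for this sketch).
    """
    current = run(["-n", name])  # "sysctl -n" prints only the value
    if current == str(wanted):
        return False
    run([f"{name}={wanted}"])  # "sysctl name=value" sets the value
    return True
```

Calling ensure_sysctl("machdep.manage_fans", 0) after boot would then
put the value back to 0 only when it has reverted to the default.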

> 2) Change the pause in powermac_thermal.c line 86 from hz to 2*hz or
> even 3*hz, in case the firmware has a bug similar to the one we used
> to have, and BSD is saturating the I2C sensors so that firmware can't
> read the temperature.

I tried this, changing the pause to 3*hz as suggested, but it appeared
to have no effect: the system still powered off after a while.

> 3) Add RackMac3,1 to the list of systems on which the firmware is
> quiesced at line 517 of /sys/powerpc/ofw/ofw_machdep.c

I've not tried this yet.

The other thing I have tried since my last posting is to run the Apple
Remote Diagnostics 1.0.4.  The Xserve G5 in question passed all 169 of
the extended tests run on it.  I have a copy of the test output log, if
anyone is interested.

What kind of system temperatures are other Xserve G5 users getting?  I
have no idea what is normal and so whether or not I should run the fans
higher.  I am currently running the fans at 6000 rpm with
machdep.manage_fans=0, mainly as a good compromise between noise and
cooling.  (The system is in my office whilst I set it up and configure
it; I was about to deploy the machine when the latest power-offs
stymied that.)

With the system largely idle, these are the temperatures being reported
with the fans running at ~6000 rpm:

dev.max6690.0.sensor.sys_ctrlr_ambient.temp: 40.0C
dev.max6690.0.sensor.sys_ctrlr_internal.temp: 51.7C
dev.ad7417.0.sensor.cpu_a_ad7417_amb.temp: 33.7C
dev.ad7417.0.sensor.cpu_a_diode_temp.temp: 45.9C
dev.ad7417.1.sensor.cpu_b_ad7417_amb.temp: 31.5C
dev.ad7417.1.sensor.cpu_b_diode_temp.temp: 45.1C

This is what the temperatures look like about 30 minutes into a
"make -j5 buildworld" (uptime: "11:23AM  up  1:03, 2 users, load
averages: 6.52, 6.52, 5.23"):

dev.max6690.0.sensor.sys_ctrlr_ambient.temp: 44.0C
dev.max6690.0.sensor.sys_ctrlr_internal.temp: 56.3C
dev.ad7417.0.sensor.cpu_a_ad7417_amb.temp: 40.0C
dev.ad7417.0.sensor.cpu_a_diode_temp.temp: 64.9C
dev.ad7417.1.sensor.cpu_b_ad7417_amb.temp: 36.0C
dev.ad7417.1.sensor.cpu_b_diode_temp.temp: 62.6C

Again, that is with the fans running at about 6000 rpm.
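
To make the two sets of readings easier to compare, here is a rough
Python sketch that parses the sysctl output above and prints how much
each sensor rose under load.  The function names parse_temps and
temp_rise are my own for illustration; the readings are the ones quoted
in this message:

```python
import re

IDLE = """\
dev.max6690.0.sensor.sys_ctrlr_ambient.temp: 40.0C
dev.max6690.0.sensor.sys_ctrlr_internal.temp: 51.7C
dev.ad7417.0.sensor.cpu_a_ad7417_amb.temp: 33.7C
dev.ad7417.0.sensor.cpu_a_diode_temp.temp: 45.9C
dev.ad7417.1.sensor.cpu_b_ad7417_amb.temp: 31.5C
dev.ad7417.1.sensor.cpu_b_diode_temp.temp: 45.1C
"""

LOAD = """\
dev.max6690.0.sensor.sys_ctrlr_ambient.temp: 44.0C
dev.max6690.0.sensor.sys_ctrlr_internal.temp: 56.3C
dev.ad7417.0.sensor.cpu_a_ad7417_amb.temp: 40.0C
dev.ad7417.0.sensor.cpu_a_diode_temp.temp: 64.9C
dev.ad7417.1.sensor.cpu_b_ad7417_amb.temp: 36.0C
dev.ad7417.1.sensor.cpu_b_diode_temp.temp: 62.6C
"""

# One "name: value" line per sensor, with the value in degrees Celsius.
LINE_RE = re.compile(r"^(\S+\.temp):\s*(-?\d+(?:\.\d+)?)C$")


def parse_temps(text):
    """Map each sensor sysctl name to its temperature in degrees C."""
    temps = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            temps[match.group(1)] = float(match.group(2))
    return temps


def temp_rise(idle, load):
    """Return load minus idle, rounded to 0.1C, for sensors in both sets."""
    return {
        name: round(load[name] - idle[name], 1)
        for name in idle
        if name in load
    }


if __name__ == "__main__":
    rise = temp_rise(parse_temps(IDLE), parse_temps(LOAD))
    # Largest increase first.
    for name, delta in sorted(rise.items(), key=lambda kv: -kv[1]):
        print(f"{name}: +{delta:.1f}C")
```

By this reckoning the CPU diode sensors rise the most (19.0C for CPU A
and 17.5C for CPU B), while the other sensors rise by 4.0C to 6.3C.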

Cheers,

Paul.




