Date: Mon, 5 Apr 2010 19:57:18 -0500
From: Kevin Day <toasty@dragondata.com>
To: Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc: freebsd-ppc@freebsd.org
Subject: Re: Xserve G4 stability (random processes crashing)
Message-ID: <2FD96EE6-1761-4040-9E5A-58A33DE1D030@dragondata.com>
In-Reply-To: <4BBA2BD8.9050003@freebsd.org>
References: <40B1BEB2-6620-4188-BB71-F8B5ED4AA234@dragondata.com>
 <4BB5EE68.2040504@freebsd.org>
 <7F22E2B9-34FB-4E3B-981E-8D2EF73A4F64@dragondata.com>
 <4BB7A9B2.3080901@freebsd.org>
 <CD09F9CA-2A98-479E-9C96-2DFDAEA42731@dragondata.com>
 <4BBA2BD8.9050003@freebsd.org>

On Apr 5, 2010, at 1:28 PM, Nathan Whitehorn wrote:

> Kevin Day wrote:
>> On Apr 3, 2010, at 3:48 PM, Nathan Whitehorn wrote:
>>
>>> Since you say UP kernels have the same problems, other G4 machines
>>> seem not to have issues, and SMP G5 Xserves are completely stable,
>>> that points at some G4 Xserve-specific piece of hardware. I'd guess
>>> the ATA controller. Could you try chroot to an NFS volume mounted
>>> from a known-stable machine, or a USB or Firewire disk, and trying
>>> the same things?
>>> -Nathan
>>>
>>
>> Okay, i've done some more playing... The problem still happens even
>> if TMPDIR, /usr/src and /usr/obj are NFS mounted to another system.
>>
>> I'm fiddling more, but I think that rules out ATA then.
>> The problem seems to take a long while to first appear, but once it
>> does appear it happens pretty fast repeatedly after that. Is it
>> possible the fan controls aren't working right?
>>
> That's possible. The fan control settings are done completely by
> hardware, though. Can you try with the whole system on NFS (i.e. a
> chroot or netbooting)?
> -Nathan

Even pure NFS (running inside a jail with all of the jail chroot over
NFS) was still crashing.

But, I think I may have figured out the issue...

This box only has 1GB of DIMMs installed, but FreeBSD is somehow seeing
1.25GB of RAM and is apparently trying to use it. If I put 2GB of RAM in
there, it correctly detects 2GB and (so far) buildworld is running fine
after three reboots.

Mac OS X is only seeing 1GB, and seems to reliably detect that. I'm
going to do some more digging to figure out where the wrong memory size
is coming from.