Date: Sat, 3 Apr 2010 20:15:09 -0500
From: Kevin Day <toasty@dragondata.com>
To: Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc: freebsd-ppc@freebsd.org
Subject: Re: Xserve G4 stability (random processes crashing)
Message-ID: <E4E04270-8D31-4D25-BC01-10E93B46899C@dragondata.com>
In-Reply-To: <4BB7A9B2.3080901@freebsd.org>
References: <40B1BEB2-6620-4188-BB71-F8B5ED4AA234@dragondata.com> <4BB5EE68.2040504@freebsd.org> <7F22E2B9-34FB-4E3B-981E-8D2EF73A4F64@dragondata.com> <4BB7A9B2.3080901@freebsd.org>
>> If anything, it seems worse on -RELEASE than -STABLE. In -STABLE I was at least able to get through a buildworld with only restarting it once, and now in -RELEASE I've restarted about 10 times and still haven't made it all the way through.
>>
>> Same symptoms as before, gcc giving internal compiler errors, segfaults, or corrupt .o files being produced. Memtester (even running in parallel with buildworld) never reports any errors. I'll keep fiddling with this, but if anyone has any suggestions on where to look for some clues, it'd be appreciated.
>>
> Since you say UP kernels have the same problems, other G4 machines seem not to have issues, and SMP G5 Xserves are completely stable, that points at some G4 Xserve-specific piece of hardware. I'd guess the ATA controller. Could you try chroot to an NFS volume mounted from a known-stable machine, or a USB or Firewire disk, and trying the same things?
> -Nathan

I think you may be on to something... trying to copy /usr/src over to an NFS mount, I got:

ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad0: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=25113024

This was repeating slowly over and over on the console with LBA changing each time. I'm going to do some more fiddling, but it does look ata related now.