Date:      Sat, 3 Apr 2010 20:15:09 -0500
From:      Kevin Day <toasty@dragondata.com>
To:        Nathan Whitehorn <nwhitehorn@freebsd.org>
Cc:        freebsd-ppc@freebsd.org
Subject:   Re: Xserve G4 stability (random processes crashing)
Message-ID:  <E4E04270-8D31-4D25-BC01-10E93B46899C@dragondata.com>
In-Reply-To: <4BB7A9B2.3080901@freebsd.org>
References:  <40B1BEB2-6620-4188-BB71-F8B5ED4AA234@dragondata.com> <4BB5EE68.2040504@freebsd.org> <7F22E2B9-34FB-4E3B-981E-8D2EF73A4F64@dragondata.com> <4BB7A9B2.3080901@freebsd.org>

>> If anything, it seems worse on -RELEASE than -STABLE.  In -STABLE I
>> was at least able to get through a buildworld with only restarting it
>> once, and now in -RELEASE I've restarted about 10 times and still
>> haven't made it all the way through.
>>
>> Same symptoms as before, gcc giving internal compiler errors,
>> segfaults, or corrupt .o files being produced.  Memtester (even running
>> in parallel with buildworld) never reports any errors. I'll keep
>> fiddling with this, but if anyone has any suggestions on where to look
>> for some clues, it'd be appreciated.
>>
> Since you say UP kernels have the same problems, other G4 machines
> seem not to have issues, and SMP G5 Xserves are completely stable, that
> points at some G4 Xserve-specific piece of hardware. I'd guess the ATA
> controller. Could you try chroot to an NFS volume mounted from a
> known-stable machine, or a USB or Firewire disk, and trying the same
> things?
> -Nathan


I think you may be on to something... trying to copy /usr/src over to an
NFS mount, I got:

ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad0: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=25113024

This was repeating slowly, over and over, on the console, with the LBA
changing each time.
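
For anyone who wants to tally messages like these from a saved console
log, here is a minimal Python sketch. The function name
summarize_ata_log, the SAMPLE text, and the two regular expressions are
illustrative assumptions based only on the lines quoted above; other
ata(4) message formats are not handled.

```python
import re
from collections import Counter

# Patterns derived only from the console lines quoted in this message.
WARNING_RE = re.compile(r"^(ad\d+): WARNING - (.+?) taskqueue timeout")
TIMEOUT_RE = re.compile(r"^(ad\d+): TIMEOUT - (\S+) retrying .*LBA=(\d+)")


def summarize_ata_log(text):
    """Count taskqueue-timeout warnings per (device, command) and
    collect (device, command, LBA) for each retried request."""
    warnings = Counter()
    retried = []
    for line in text.splitlines():
        line = line.strip()
        m = WARNING_RE.match(line)
        if m:
            warnings[(m.group(1), m.group(2))] += 1
            continue
        m = TIMEOUT_RE.match(line)
        if m:
            retried.append((m.group(1), m.group(2), int(m.group(3))))
    return warnings, retried


# Sample input: the console lines from this message, one per line.
SAMPLE = """\
ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad0: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=25113024
"""

if __name__ == "__main__":
    warnings, retried = summarize_ata_log(SAMPLE)
    for (dev, cmd), count in sorted(warnings.items()):
        print(f"{dev} {cmd}: {count}")
    for dev, cmd, lba in retried:
        print(f"{dev} {cmd} retried at LBA {lba}")
```

Feeding it a longer capture would show whether the retried LBAs cluster
in one region of the disk or are spread out, which may help separate a
failing drive from a controller or driver problem.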

I'm going to do some more fiddling, but it does look ATA-related now.




