From: Kevin Day
Date: Sat, 3 Apr 2010 20:15:09 -0500
To: Nathan Whitehorn
Cc: freebsd-ppc@freebsd.org
Subject: Re: Xserve G4 stability (random processes crashing)

>> If anything, it seems worse on -RELEASE than -STABLE. In -STABLE I was at least able to get through a buildworld with only restarting it once, and now in -RELEASE I've restarted about 10 times and still haven't made it all the way through.
>>
>> Same symptoms as before, gcc giving internal compiler errors, segfaults, or corrupt .o files being produced. Memtester (even running in parallel with buildworld) never reports any errors. I'll keep fiddling with this, but if anyone has any suggestions on where to look for some clues, it'd be appreciated.
>>
> Since you say UP kernels have the same problems, other G4 machines seem not to have issues, and SMP G5 Xserves are completely stable, that points at some G4 Xserve-specific piece of hardware. I'd guess the ATA controller. Could you try chroot to an NFS volume mounted from a known-stable machine, or a USB or Firewire disk, and trying the same things?
> -Nathan

I think you may be on to something... trying to copy /usr/src over to an NFS mount, I got:

ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing request directly
ad0: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing request directly
ad0: WARNING - SET_MULTI taskqueue timeout - completing request directly
ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=25113024

This was repeating slowly over and over on the console with LBA changing each time. I'm going to do some more fiddling, but it does look ata related now.