From: Jo Rhett
To: Peter Wemm
Cc: FreeBSD Stable
Date: Mon, 21 Jul 2008 14:17:29 -0700
Subject: chipset causing locks.
References: <20080711164939.GA10238@lava.net>

Thanks for the note. No, just a coincidence. The chipset is a VIA ProSavageDDR KM266. But thanks for bringing that up ;-)

FWIW, as others have speculated enabling more logging from GEOM produced nothing. It does appear to be a hardware failure of some sort.

On Jul 18, 2008, at 11:29 PM, Peter Wemm wrote:
> On Wed, Jul 16, 2008 at 2:42 PM, Jo Rhett wrote:
>>> On Fri, Jul 11, 2008 at 12:59:33AM -0700, Jo Rhett wrote:
>>>>
>>>> Every time it is rebuilding ad0. Every single boot in the last
>>>> two weeks.
>>
>> On Jul 11, 2008, at 9:49 AM, Clifton Royston wrote:
>>>
>>> That just means that it halted without a proper shutdown. If it
>>> crashes, the mirror isn't stopped properly, so it's marked dirty,
>>> so it must rebuild it. It is the precise analogy of finding all
>>> the file systems dirty on boot and fscking them, following a crash.
>>
>>
>> Thanks for the clarification. Dang, I hoped I was on to something.
>
> This is really off on a tangent, but I thought I'd mention it on the
> off-chance that it fit your problem.
>
> Recently there have been grumblings about heat problems with certain
> nvidia chipsets on consumer boards. Apparently, there is some process
> issue, if you believe trade rags like theinquirer.net etc. Apparently
> there is some issue with heat damage over time. Consumer motherboards
> with passive cooled (no fan) heat pipes etc seem to be particularly
> vulnerable. I use the word "apparently" because it is far from a
> verified fact.
>
> However, I've got two motherboards, one running freebsd, one running
> windows, with nvidia chipsets. Both used to be fine with onboard IDE
> activity. Both now use raid controllers so the IDE interfaces have
> been idle for a good year or so.
>
> Something came up and I had to use the IDE interfaces for a lot of
> data transfer. Suddenly, both machines are flakey. The windows
> machine blue screens under load. My freebsd box just "turns off"
> (motherboard appears to power off, but the power supply is on still).
> The same happens when I use a linux boot disk, so I know its not
> FreeBSD's fault.
>
> The common factor seems to be that the motherboards are now about a
> year and a half old. They both have the same nvidia south bridge that
> theinquirer.net was trashing. Both used to work fine, now have
> problems with IDE. and now I recalled the article and started
> wondering...
>
> Do you, by any wildly remote chance, have an nvidia based motherboard?
>
> I believe the fault I'm seeing is the system asserting a fatal error
> by doing a HT ECC flood to halt everything.
>
> --
> Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com;
> KI6FJV
> "All of this is for nothing if we don't go to the stars" - JMS/B5
> "If Java had true garbage collection, most programs would delete
> themselves upon execution." -- Robert Sewell

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness