Date: Mon, 31 May 2004 10:50:13 -0700 (PDT)
From: Doug White
To: Don Bowman
cc: "'current@freebsd.org'"
Subject: RE: hang with raid, postgresql
Message-ID: <20040531104555.E95992@carver.gumbysoft.com>
List-Id: Discussions about the use of FreeBSD-current

On Sun, 30 May 2004, Don Bowman wrote:

> From: Doug White [mailto:dwhite@gumbysoft.com]
> > On Sun, 30 May 2004, Don Bowman wrote:
> >
> > > I have a system with 2x 2.8GHz XEON (P4), intel e7501 chipset,
> > > 4GB of ram, aac [adaptec 2200s] raid with 4 scsi
> > > disks. I have also tried asr (adaptec 2015).
> > > I have tried two different motherboards.
> > > The only application the machine runs is postgresql,
> > > with about 30 databases, about 250GB of data.
> > >
> > > I'm finding the machine locks up solid once a day
> > > or so (sometimes more, sometimes less, no pattern
> > > of time of day). I know it's not a hardware issue; it
> > > is reliable with FreeBSD 4.7. I've run through memory
> > > test, disk test, etc.
> > >
> > > There appears to be a correlation between
> > > disk activity (postgresql vacuum) and the lockup,
> > > but I can't be sure.
> >
> > Temperature?
> >
> > What motherboard is it exactly?
>
> lmmon shows the mobo temperature @ 28C. It is in
> an AC-controlled environment (~20C ambient). The system
> has 6 blower fans, ducted over the CPUs, with the
> copper heat sinks designed for the 3.2GHz XEON.

Alright, so it's a pretty beefy server chassis, although it could also be
an underperforming power supply or a SCSI terminator.

> It has 3 power supplies, each with separate AC
> inlet, fed from a UPS with filtered power.
> It should have ~150% airflow redundancy, and
> ~200% power redundancy.
> This is a supermicro X5DPE motherboard.

Do you happen to have the IPMI option board for this system?

> http://www.supermicro.com/products/chassis/3U/933/SC933S2-R760.cfm
> shows the system.

That's the chassis :-)

> It was tested for ~1 week with FreeBSD 4.7
> at temperature in an environmental chamber,
> including cycling into memtest86 every 2 hours.
>
> I've been battling this hang for ~6 weeks; this is
> a swap-out of all the hardware (new system).

Still seems hardware-related to me, although I've found hard hangs caused
by buggy optimization on amd64.

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite@gumbysoft.com          |  www.FreeBSD.org