From: "Don O'Neil"
To: freebsd-questions@freebsd.org
Date: Thu, 5 Apr 2007 09:45:15 -0700
Subject: RE: Problems with SMP on 6.1-STABLE-200608
Message-ID: <00de01c777a1$c9bf83c0$0600020a@mickey>
References: <447it12z00.fsf@be-well.ilk.org>

More info on my problem.....

I swapped out the MB, CPUs, RAM, and power supply, and I still have the
problem with the kernel panicking when running on SMP. When I rebuild
the kernel without SMP, the machine is rock solid, even under VERY high
loads.

I set up the old MB, CPUs, RAM & power supply on the bench, with a new
6.1-STABLE-200608 AND 6.2-RELEASE install, and ran dozens of copies of
the stress port.
Even with it bringing loads up to >250, and eating up all available RAM
and swap, I could not get the kernel to panic.

The ONLY difference between the bench setup and the production setup is
a 3ware Escalade RAID card. I am going to set up another array on the
bench with a spare card I have and see if I can get it to panic under
that setup (which will be identical hardware-wise to the production
box).

The only things I can think of right now are the following:

1) Bad RAID card or cables <- unlikely, since it should show up even in
   uniprocessor mode
2) Problem with the TWE driver in SMP mode <- more likely

I'm leaning towards #2, especially with the other recent reports of
someone else getting kernel panics with 3ware products.

Anyone else have any thoughts as to what scenarios/tools I should try
to isolate the problem?

-----Original Message-----
From: owner-freebsd-questions@freebsd.org
[mailto:owner-freebsd-questions@freebsd.org] On Behalf Of
youshi10@u.washington.edu
Sent: Wednesday, March 28, 2007 8:48 AM
To: freebsd-questions@freebsd.org
Subject: Re: Problems with SMP on 6.1-STABLE-200608

On Wed, 28 Mar 2007, Lowell Gilbert wrote:

> "Don O'Neil" writes:
>
>> I've been having problems with my server freezing up, having the #2
>> CPU 'shut down', kernel panics, and all sorts of nastiness....
>>
>> Originally I thought it was exim, or possibly bind, or bad hardware
>> (mb, cpu or memory)... I've swapped out the motherboard & CPUs &
>> memory from an old server that was running 4.11 ROCK SOLID for years...
>>
>> At first I thought the problem was solved, but now it's popping up
>> again... The 2nd CPU gets 'shut down', or the kernel panics,
>> essentially taking the system offline.
>
> There are lots of things this could be, and I certainly wouldn't rule
> out hardware problems (power supply?). Figuring out the problems
> directly would certainly involve looking at more details than you're
> listing here.
>
>> If I install a single CPU (non-SMP) kernel, then the system works
>> fine... (I did this on the old motherboard before I swapped it out,
>> and it worked fine too).. So I'm wondering if there is an SMP bug or
>> problem I'm running into.
>>
>> I'm running 6.1-STABLE-200608, an ISO image I downloaded from the
>> archives when I built the box (NOT 6.1-RELEASE).
>
> The whole point of making releases is that it's much easier to support
> a small number of known reference software configurations.
>
>> I'm running an Intel Serverworks motherboard with 2 1.4 GHz
>> PIIIs... The problem only seems to show up under high load.
>
> I don't think I've heard of anything similar. I think there are a
> bunch of these boards out there.
>
>> I'm wondering what I should do here...
>>
>> I'm concerned that doing a binary upgrade to 6.2 won't fix the
>> problem, and I've tried using freebsd-update, but it complains about
>> the version not being compatible.
>>
>> If I do a binary upgrade from CD, will it also update the kernel
>> sources so I can build a new one? Will it complain about it not being
>> compatible?
>
> It can give you the sources; that's a menu option during the install.
> That should work fine.
>
>> Is there a way to 'force' the ID of the system to be 6.1-RELEASE so
>> that freebsd-update will work?
>
> Well, yes, but there's a reason for the check, you know...
>
>> Will doing the 6.1-6.2 binary upgrade as posted by Colin also update
>> the kernel sources?
>
> I don't know what procedure he described, so I don't know. But if you
> update to 6.2-RELEASE, then it will be easy to get the right sources
> afterwards. Again, that is the advantage of having releases.
>
>> Would my best option really be to start over with a fresh install
>> rather than upgrade? (this would be painful)
>
> If it's that painful, you'd probably be well served to have a spare
> system to stage changes on.
> In addition to being good risk
> management, it saves you time, which is worth something too.
>
>> I'm going to try to test out 6.2 on the old MB/CPU combo to see if I
>> can re-create it under 6.2 as well before I do anything. As well as
>> try doing an upgrade on the bench from CD from 6.1-STABLE-200608 to
>> 6.2-RELEASE... Since this is a production server (and for months it
>> was burned in with no apparent issues) I only have 1 shot at this to
>> do it right.
>>
>> Any help/recommendation would be appreciated.
>
> Good luck.

Honestly I would probe around your motherboard a bit checking voltages
(power supply) and/or heat dissipation, because those are the most
likely causes if it _only_ fails under high load. Next thing to check
would be RAM integrity.

-Garrett
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"