From: Palle Girgensohn
To: Rutger Bevaart
Cc: freebsd-smp@freebsd.org, Rutger Bevaart
Date: Thu, 18 Aug 2005 14:46:03 +0200
Subject: Re: FreeBSD unstable on Dell 1750 using SMP?

--On Thursday, August 18, 2005 14:30:44 +0200 Rutger Bevaart wrote:

> It seems that updating our machine to 5.4-p5 (RELENG_5_4) has solved this,
> or at least made it occur less frequently. Our last reboot was after
> building and installing the new kernel and it hasn't gone down since.

Very interesting. We're still at 5.4-p1. The version bump fixes didn't look
like they were addressing stability, only security, but why not...

> This
> is with SMP, ACPI and HT enabled on a Dell 1750 with two 3GHz Xeons.

Pretty much identical to our system.

> The
> 2850 has been rock-stable running 5.4-p3.

And you never ran previous versions on that system?

> Whatever it was, it seems to
> have been fixed around that time.
>
> Could be that your issues are amd64 related. We run the i386 branch
> because we need stable systems, not 64bit.

I have indications that the problems have occurred equally on i386 and
amd64, and that amd64 is considered stable, but that might not be quite
true?

Regards,
Palle

> The issue still persists on 4.11 though. Can somebody explain what the
> ACPI fixes were around that time and if they will be backported to 4.X?
>
> Regards
> Rutger Bevaart
>
> On Thu, August 18, 2005 1:55, Palle Girgensohn said:
>>
>> --On Friday, July 15, 2005 15:15:24 +0200 Rutger Bevaart
>> wrote:
>>
>>> hello list,
>>>
>>> For the past year we've been running several Dell PowerEdge 1750 servers
>>> on FreeBSD 4.10, 4.11 and 5.3. All these machines have dual Xeons
>>> running with HT enabled. This install has proven to be unstable in that
>>> the machine will reboot after anywhere from 3 to 170 days for no
>>> apparent reason. No log is written. Other machines we have with a
>>> single CPU (HT enabled) do not experience this problem.
>>>
>>> As it is present in both 4.x and 5.x and googling for the last year has
>>> not revealed similar experiences, I'm consulting this list. As all of
>>> these machines are production machines that have a continuous load (not
>>> a heavy load, but a light average with some peaks) it's not easy to
>>> experiment with HT settings etc. I dislike driving to the datacenter for
>>> locked systems with fubarred kernels ;-)
>>>
>>> The only error I've ever seen just before a reboot is "bge0: discard
>>> frame w/o packet header" on the 5.3 machine.
>>
>> Late comment while browsing the list for tips...
>>
>> No good clues, I'm afraid, but we have a 2850, and it is far from stable,
>> crashing within hours when running SMP, often but not always under high
>> load. Single CPU works like a charm. This is very annoying, to say the
>> least. See my posts on amd64@ around June 15.
>>
>> FreeBSD 5.4p1 (amd64). Dell 2850 with dual Xeon CPUs, EM64T.
>>
>> /Palle
>>
>
> Rutger Bevaart :: illian.networks