From: Willem Jan Withagen
Date: Wed, 22 Oct 2008 15:31:51 +0200
Cc: current@freebsd.org
Subject: Re: SMP opteron system freezes
Message-ID: <48FF2B47.7010804@withagen.nl>
In-Reply-To: <48FCA4E4.4070508@withagen.nl>

Willem Jan Withagen wrote:
> Willem Jan Withagen wrote:
>> Steve Kargl wrote:
>>> On Sat, Oct 18, 2008 at 12:20:49AM +0200, Willem
Jan Withagen wrote:
>>>> I'm sort of assuming that the bge0: timeouts and coalesced links are due to the freezing.
>>>
>>> Does the following help?
>>
>> Just a little...
>> It now takes a little longer for the system to freeze, but eventually it will.
>> The coalesced messages did not return.
>>
>> Just out of curiosity I also plugged in an fxp-card.
>> And there it takes even longer for the system to freeze, but in the end it does freeze.
>>
>> The "funny" part is that once in a while it is revivable by going into the kernel-debugger and then just continuing.
>> Sometimes a long wait (10 sec) will suffice, during which there is no keyboard response whatsoever.
>> But in other instances the system is dead in the water, and only a hardware reset gets it back.
>>
>> Something I'm still wondering is whether this happens only with NFS traffic, or with all other types of network traffic as well. But I haven't tested this.
>
> Well, I tested something different.
>
> This is an (older) dual Opteron 244 system. So each chip has only one core.
> And I removed one of the processors...
>
> Guess what:
> It just runs without any problems as far as I could test.
>
> With 2 processors it is enough to let init start all the NFS-related stuff in /etc/rc.d to lock up the system.
>
> So I guess we need to look at totally different things.
> Given enough time, I'll check and see whether 7.x does run without trouble.
>
> If somebody thinks this thread should go to amd64, just say so.
> But I am running the i386 stuff.

Tested 7.1-PRERELEASE, and that seems to run with mount problems.

So my guess is that either something in my hardware is really weirdly broken, or there is some other problem that is really bothering me.

So I'm in the process of getting the serial console working to capture some of the traceback and stuff.

People wanting to compare dmesg.8 and dmesg.7, have a look at
www.tegenbosch28.nl:/FreeBSD/Toy

--WjW