From owner-freebsd-current@FreeBSD.ORG Mon Oct 20 16:13:31 2008
Date: Mon, 20 Oct 2008 09:13:30 -0700
From: Steve Kargl
To: Willem Jan Withagen
Cc: current@freebsd.org
Subject: Re: SMP opteron system freezes, Was: Re: Freezing or stalling current system
Message-ID: <20081020161330.GA58868@troutmask.apl.washington.edu>
In-Reply-To: <48FCA4E4.4070508@withagen.nl>
References: <48F90FC1.3040503@digiware.nl> <20081018002133.GA36113@troutmask.apl.washington.edu> <48FBB431.4090102@digiware.nl> <48FCA4E4.4070508@withagen.nl>

On Mon, Oct 20, 2008 at 05:33:56PM +0200, Willem Jan Withagen wrote:
> Willem Jan Withagen wrote:
> >Steve Kargl wrote:
> >>On Sat, Oct 18, 2008 at 12:20:49AM +0200, Willem Jan Withagen wrote:
> >>>I'm sort of assuming that the bge0: timeouts and coalesced links are
> >>>due to the freezing.
> >>
> >>Does the following help?
> >>
> >
> >Just a little...
> >It now takes a little longer for the system to freeze, but eventually it
> >will.
> >The coalesced messages did not return.
> >
> >Just out of curiosity I also plugged in an fxp-card.
> >And there it takes even longer for the system to freeze, but in the end
> >it does freeze.
> >
> >The "funny" part is it once in a while is revivable by going into the
> >kernel-debugger and then just continue.
> >Sometimes a long wait (10 sec) will suffice, during which there is no
> >keyboard response whatsoever.
> >But on other instances the system is dead in the water, and only a
> >hardware reset gets it back.
> >
> >Something I'm still wondering is if this only is with NFS traffic, or with
> >all other types of network traffic. But I haven't tested this.
>
> Well I tested something different.
>
> This is an (older) dual opteron 244 system. So each chip has only one core.
> And I removed one of the processors...
>
> Guess what:
> It just runs without any problems as far as I could test.
>
> With 2 processors it is just enough to let init start all the nfs related
> stuff in /etc/rc.d and lock up the system.
>
> So I guess we need to look at totally different things.
> Given enough time, I'll check and see whether 7.x does run without trouble.
>
> If somebody thinks this thread should go to amd64, just say so.
> But I am running the i386 stuff.
>
> dmesg and stuff in http://www.tegenbosch28.nl:/FreeBSD/toy
> (although I see I have to fire up the system again to get a correct dmesg.)

I forgot to mention that I also set a few sysctl values in /etc/sysctl.conf.
You might try adding the following:

net.inet.tcp.sendspace=131072
net.inet.tcp.recvspace=131072
net.inet.udp.recvspace=65536
net.inet.raw.recvspace=16384
net.inet.tcp.path_mtu_discovery=0
net.inet.tcp.rexmit_min=30
net.inet.tcp.log_debug=0
#
# When MSI was added to the kernel, I needed the following for
# the Tyan motherboard that I have.  Things may have improved,
# but I haven't bothered to test.
#
hw.pci.enable_msix=0
hw.pci.enable_msi=0

-- 
Steve
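
If you would rather try the values one at a time before committing them to /etc/sysctl.conf, something like the rough Python sketch below could turn the list into individual "sysctl name=value" command lines. The helper names (parse_sysctl_conf, as_commands) and the SAMPLE string are made up for illustration; the script only prints the commands and does not run them. I have not checked that every one of these can be changed on a running system; the hw.pci.* entries in particular may only matter at boot, so treat that part as an assumption.

```python
def parse_sysctl_conf(text):
    """Return (name, value) pairs from sysctl.conf-style text, in file order."""
    settings = []
    for raw in text.splitlines():
        # Drop anything after '#' (comments) and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings.append((name.strip(), value.strip()))
    return settings


def as_commands(settings):
    """Format each pair as a `sysctl name=value` command line (not executed)."""
    return ["sysctl {}={}".format(name, value) for name, value in settings]


# The values suggested in this message, copied as-is.
SAMPLE = """\
net.inet.tcp.sendspace=131072
net.inet.tcp.recvspace=131072
net.inet.udp.recvspace=65536
net.inet.raw.recvspace=16384
net.inet.tcp.path_mtu_discovery=0
net.inet.tcp.rexmit_min=30
net.inet.tcp.log_debug=0
#
# MSI workaround for the Tyan motherboard
#
hw.pci.enable_msix=0
hw.pci.enable_msi=0
"""

if __name__ == "__main__":
    for cmd in as_commands(parse_sysctl_conf(SAMPLE)):
        print(cmd)
```

Applying the network values one by one and retesting after each change might help narrow down which setting, if any, delays the freeze.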