Date:      Mon, 20 Oct 2008 09:13:30 -0700
From:      Steve Kargl <sgk@troutmask.apl.washington.edu>
To:        Willem Jan Withagen <wjw@withagen.nl>
Cc:        current@freebsd.org
Subject:   Re: SMP opteron system freezes, Was: Re: Freezing or stalling current system
Message-ID:  <20081020161330.GA58868@troutmask.apl.washington.edu>
In-Reply-To: <48FCA4E4.4070508@withagen.nl>
References:  <48F90FC1.3040503@digiware.nl> <20081018002133.GA36113@troutmask.apl.washington.edu> <48FBB431.4090102@digiware.nl> <48FCA4E4.4070508@withagen.nl>

On Mon, Oct 20, 2008 at 05:33:56PM +0200, Willem Jan Withagen wrote:
> Willem Jan Withagen wrote:
> >Steve Kargl wrote:
> >>On Sat, Oct 18, 2008 at 12:20:49AM +0200, Willem Jan Withagen wrote:
> >>>I'm sort of assuming that the bge0: timeouts and coalesced links are 
> >>>due to the freezing.
> >>
> >>Does the following help?
> >>
> >
> >Just a little...
> >It now takes a little longer for the system to freeze, but eventually it 
> >will.
> >The coalesced messages did not return.
> >
> >Just out of curiosity I also plugged in an fxp card.
> >And there it takes even longer for the system to freeze, but in the end 
> >it does freeze.
> >
> >The "funny" part is that once in a while it is revivable by going into the 
> >kernel debugger and then just continuing.
> >Sometimes a long wait (10 sec) will suffice, during which there is no 
> >keyboard response whatsoever.
> >But on other instances the system is dead in the water, and only a 
> >hardware reset gets it back.
> >
> >I'm still wondering whether this happens only with NFS traffic, or with 
> >all other types of network traffic. But I haven't tested this.
> 
> Well I tested something different.
> 
> This is an (older) dual Opteron 244 system. So each chip has only one core.
> And I removed one of the processors...
> 
> Guess what:
> 	It just runs without any problems as far as I could test.
> 
> With 2 processors it is just enough to let init start all the nfs related 
> stuff in /etc/rc.d and lock up the system.
> 
> So I guess we need to look at totally different things.
> Given enough time, I'll check and see whether 7.x does run without trouble.
> 
> If somebody thinks this thread should go to amd64, just say so.
> But I am running the i386 stuff.
> 
> dmesg and stuff in http://www.tegenbosch28.nl:/FreeBSD/toy
> (although I see I have to fire up the system again to get a correct dmesg.)

I forgot to mention that I also set a few sysctl values in
/etc/sysctl.conf.  You might try adding the following:

net.inet.tcp.sendspace=131072
net.inet.tcp.recvspace=131072
net.inet.udp.recvspace=65536
net.inet.raw.recvspace=16384
net.inet.tcp.path_mtu_discovery=0
net.inet.tcp.rexmit_min=30
net.inet.tcp.log_debug=0
#
# When MSI was added to the kernel, I needed the following for
# the Tyan motherboard that I have.  Things may have improved,
# but I haven't bothered to test.
#
hw.pci.enable_msix=0
hw.pci.enable_msi=0
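
To try the values above on a running system before committing them to
/etc/sysctl.conf, a minimal sketch along these lines may help.  The helper
name sysctl_cmds is hypothetical (it is not part of FreeBSD); it only
prints one sysctl(8) command per name=value line and changes nothing
itself.  It assumes every OID is writable at runtime.  The two hw.pci.*
entries are probably an exception: they appear to be consulted when
devices attach, so they may only take effect when set from
/boot/loader.conf.

```sh
#!/bin/sh
# Hypothetical dry-run helper: print the sysctl(8) command for each
# name=value line in a sysctl.conf-style file.  Blank lines and lines
# starting with '#' are skipped.  Nothing is applied here.
sysctl_cmds() {
    while IFS= read -r line; do
        case "$line" in
            ''|'#'*) continue ;;
        esac
        printf 'sysctl %s\n' "$line"
    done < "$1"
}
```

After reviewing the printed commands, they can be applied as root by
piping the output to sh, e.g. "sysctl_cmds /etc/sysctl.conf | sh".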

-- 
Steve


