Date: Tue, 3 Nov 2009 10:43:34 -0500 (EST)
From: Weldon S Godfrey 3
To: Gavin Atkinson
Cc: freebsd-current@FreeBSD.org
Subject: Re: FreeBSD 8.0 - network stack crashes?
In-Reply-To: <1257261214.98619.92.camel@buffy.york.ac.uk>
References: <1257185816.44755.29.camel@buffy.york.ac.uk> <1257261214.98619.92.camel@buffy.york.ac.uk>

If memory serves me right, sometime around 3:13pm, Gavin Atkinson told me:

> OK, at least we've figured out what is going wrong then.  As a
> workaround to get the machine to stay up longer, you should be able to
> set kern.ipc.nmbclusters=256000 in /boot/loader.conf -but hopefully we
> can resolve this soon.
>
> Firstly, what kernel was the above output from?  And what network card
> are you using?  In your initial post you mentioned testing both bce(4)
> and em(4) cards, be aware that em(4) had an issue that would cause
> exactly this issue, which was fixed with a commit on September 11th
> (r197093).  Make sure your kernel is from after that date if you are
> using em(4).  I guess it is also possible that bce(4) has the same
> issue, I'm not aware of any fixes to it recently.
>
> So, from here, I think the best thing would be to just use the em(4) NIC
> and an up-to-date kernel, and see if you can reproduce the issue.
>
> How important is this machine?  If em(4) works, are you able to help
> debug the issues with the bce(4) driver?
>
> Thanks,
>
> Gavin
>

We used the em card only a few times, but the problem happened each time
we used it, so we have been staying with the onboard NICs using the bce
driver.  Would leaving the em card in cause any issues, even if it isn't
up?

This output was from a kernel on 12/08.  The issue really came up while
we tried to swap to 8.0-RC2.  We plan to swap back sometime in the near
future.  The same symptoms happened with RC2, so I am sure it is a kmem
exhaustion.  I am guessing v3 requires a lot more.  When we switch, I'll
change to using the em card.

This machine is very important.  I could set up an additional machine,
but I don't have the ability to simulate the load, nor do I have the
large drive array attached.

Thanks!

Weldon