From: "Alan Jay"
To: "'Doug White'"
cc: freebsd-amd64@freebsd.org
Date: Fri, 11 Mar 2005 11:37:08 -0000
Subject: RE: BroadcomBCM5704C 10/100/1000 on TyanThunder K8S pro S2882 twin Opteron
In-Reply-To: <20050310230725.D64217@carver.gumbysoft.com>
Message-Id: <20050311113709.97A8A54821@buxton.digitalspy.co.uk>
List-Id: Porting FreeBSD to the AMD64 platform

> -----Original Message-----
> From: Doug White [mailto:dwhite@gumbysoft.com]
> On Thu, 10 Mar 2005, Alan Jay wrote:
>
> > > From: Doug White
> > >
> > > On Mon, 7 Mar 2005, Alan Jay wrote:
> > >
> > > > Well after upgrading to the latest -STABLE via cvsup and makeworld makekernel
> > > > etc we have been doing some more tests over the weekend.
> > >
> > > When did you run this cvsup?
> >
> > [Alan Jay] March 2nd.
>
> Being that it's been a week you might give this another spin.

[Alan Jay] OK, will do so - things are moving that fast, are they? Being a
newbie at this kind of thing: for a small time period like this, do I need to
do a make world and make kernel and follow the full list of steps, or can I
get away with just a new kernel?

> > > > around 6 of the 8Gb of RAM the server then logged:
> > > >
> > > > Mar 7 07:42:47 flappy kernel: bge1: discard frame w/o leading ethernet header
> > > > (len 4294967292 pkt len 4294967292)
> > >
> > > Hm, unsigned -1. That message is printed by ether_input() if it gets
> > > handed a bum mbuf.
> > >
> > > > Followed by:
> > > >
> > > > Mar 7 07:42:47 flappy kernel: Fatal trap 12: pag
> > >
> > > Unfortunately this is not useful. We need the entire panic message and
> > > ideally a backtrace and crashdump. Can you connect a serial console to
> > > this system and log the output?
> >
> > [Alan Jay] We have done that but the serial terminal is attached to a terminal
> > concentrator and it seems to time out before logging any useful information.
> > When we succeeded there was nothing on the serial console in the way of a
> > panic message. Sorry, not sure how to do a backtrace or crashdump?
>
> See the section on kernel debugging in the Developer's Handbook. You
> activate crashdumps by nominating a partition that is at least as large as
> memory with the 'dumpdev' rc.conf variable (and can be enabled at runtime
> with the 'dumpon' command). Once the machine panics and creates the
> crashdump, on the ensuing reboot savecore will automatically run and
> extract the crashdump. With the crashdump in hand you can use kgdb and a
> debugging kernel image to figure out what happened.

[Alan Jay] Thanks, will add this in and look at the Developer's Handbook.

> > > > Subsequently to that it has crashed a number of times and on a couple of
> > > > occasions has reported:
> > > >
> > > > kernel: fxp0: can't map mbuf (error 12)
> > >
> > > Error 12 is ENOMEM and that's coming from bus_dmamap_load_mbuf().
> > > That can
> > > be returned if you're running out of space for bounce buffers, or kmem in
> > > general. scottl has been working on busdma issues in HEAD and recently
> > > committed a fix for i386 for bounce page allocation issues.
> > >
> > > kmem depletion would be more insidious. Have you been getting other
> > > messages that indicate failure to allocate memory or error 12?
> >
> > [Alan Jay] I had seen them before on the console several times.
>
> Hm, then kmem depletion may be in play. Unfortunately I've not tuned kmem
> on amd64 so I don't know if the same variables on i386 apply.

[Alan Jay] OK thanks.

> > > > By the way over the weekend the latest -STABLE which is marked 5.4-PRERELEASE
> > > > 2 seemed much better than 5.3 had and the initial problems took much longer to
> > > > appear. Though once the problems started to appear, they repeated themselves
> > > > rebooting every 1-2hrs until we removed the tests data.
> > >
> > > That behavior sounds a lot like thermal issues. It takes a while to warm
> > > up to the critical point and once it hits that point it really starts to
> > > malfunction. Unless the test run starts out slow or something.
> >
> > [Alan Jay] Unlikely as the servers have been on 24hrs a day since we got them
> > in a rack at a data centre so the temperature should be reasonably consistent.
>
> Right, but a failed fan keeps that nice cool air from getting to the
> burning hot parts. :)

[Alan Jay] Indeed that is true, and I will check the fans when I am next in,
but it is relatively low down my list of potential problems, especially as we
have seen similar problems on both servers and it only happens when a certain
test is done; all the others are fine. But I never rule anything out.

Thanks for all the input, it has been very useful.

Alan
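
For reference, the two numbers discussed in the thread (the 4294967292 frame
length from the bge1 message and "error 12" from fxp0) can be decoded with a
short script. This is only a sketch and not part of the original exchange:
the helper name as_signed32 is made up for illustration, and it assumes the
logged length is a 32-bit quantity, which the thread does not state outright.

```python
import errno
import os


def as_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed two's-complement int.

    Hypothetical helper for illustration; assumes a 32-bit field.
    """
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


# "len" value copied from the bge1 log line quoted in the thread.
logged_len = 4294967292
print("signed view of logged length:", as_signed32(logged_len))

# "error 12" from the fxp0 message; look up its symbolic name and text
# using the errno table of whatever system runs this script.
code = 12
print("errno", code, "=", errno.errorcode[code], "-", os.strerror(code))
```

Read this way, the logged length is a small negative number that wrapped
around when treated as unsigned, which fits the "unsigned" reading of the
message, and code 12 resolves to ENOMEM as stated above.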