From: "Steven Hartland"
To: , "'Ian Lepore'"
Cc: freebsd-hackers@freebsd.org, dteske@freebsd.org
Subject: Re: kgzip(1) is broken
Date: Wed, 16 Jan 2013 01:07:49 -0000
List-Id: Technical Discussions relating to FreeBSD

----- Original Message
-----
From:
To: "'Ian Lepore'"
Cc: ;
Sent: Wednesday, January 16, 2013 12:56 AM
Subject: RE: kgzip(1) is broken

>
>
>> -----Original Message-----
>> From: Ian Lepore [mailto:freebsd@damnhippie.dyndns.org]
>> Sent: Tuesday, January 15, 2013 4:43 PM
>> To: Devin Teske
>> Cc: dteske@freebsd.org; freebsd-hackers@freebsd.org
>> Subject: RE: kgzip(1) is broken
>>
>> On Tue, 2013-01-15 at 16:10 -0800, Devin Teske wrote:
>> >
>> > > -----Original Message-----
>> > > From: Devin Teske [mailto:devin.teske@fisglobal.com] On Behalf Of
>> > > dteske@freebsd.org
>> > > Sent: Tuesday, January 15, 2013 3:10 PM
>> > > To: 'Ian Lepore'
>> > > Cc: freebsd-hackers@freebsd.org; dteske@freebsd.org
>> > > Subject: RE: kgzip(1) is broken
>> > >
>> > >
>> > >
>> > > > -----Original Message-----
>> > > > From: Ian Lepore [mailto:freebsd@damnhippie.dyndns.org]
>> > > > Sent: Tuesday, January 15, 2013 3:05 PM
>> > > > To: dteske@freebsd.org
>> > > > Cc: freebsd-hackers@freebsd.org
>> > > > Subject: Re: kgzip(1) is broken
>> > > >
>> > > > On Tue, 2013-01-15 at 13:27 -0800, dteske@freebsd.org wrote:
>> > > > > Hello,
>> > > > >
>> > > > > I have been sad of-late because kgzip(1) no longer produces a usable kernel.
>> > > > >
>> > > > > All versions of 9.x suffer this.
>> > > > >
>> > > > > And somewhere between 8.3-RELEASE-p1 and 8.3-RELEASE-p5 this recently broke in
>> > > > > the 8.x series.
>> > > > >
>> > > > > I haven't tried the 7 series lately, but if whatever is making the rounds gets
>> > > > > MFC'd that far back, I expect the problem to percolate there too.
>> > > > >
>> > > > > The symptom is that the machine reboots immediately and unexpectedly the moment
>> > > > > the kernel is executed by the loader.
>> > > > >
>> > > > > This is quite troubling and I am looking for someone to help find the culprit. I
>> > > > > don't know where to start looking.
>> > > >
>> > > > Here are some possible candidates from the things that were MFC'd to 8
>> > > > in that timeframe. I haven't looked at what these do, they're just
>> > > > changes that affect files related to booting.
>> > > >
>> > > > r233211
>> > > > r233377
>> > > > r233469
>> > > > r234563
>> > > >
>> > >
>> > > Thanks Ian!
>> > >
>> > > I'll test each one individually to see if regressing any one (or all) addresses
>> > > the problem.
>> >
>> > Progress...
>> >
>> > Looks like I found the culprit.
>> >
>> > Turns out it's a back-ported bxe(4) driver (back-ported from 9 -- where kgzip
>> > seems to never work).
>> >
>> > I wonder why back-porting bxe(4) from stable/9 to releng/8.3 would cause kgzip
>> > to produce non-working kernels.
>> >
>>
>> Yeah, it'll be interesting to see how a device driver can lead to "the
>> machine reboots immediately and unexpectedly the moment the kernel is
>> executed by the loader," which I took to mean "before seeing the
>> copyright or anything."
>>
>
> Indeed... loader throws up the syms and upon execution *KABOOM* (screen goes
> black and back to POST)
>
> The copyright never appears.
>
>
>> > I'm emailing the maintainers (davidch + other Broadcom folk)
>> >
>
> The current dossier is even more interesting... the back-ported driver (with
> zero modifications mind you from stable/9 to stable/8) exhibits memory failures
> (example below), and causes terminals to become wedged when attempting to (for
> example) scp a file over an existing configured network (igb-based -- presumably
> unrelated to bxe but in practice loading bxe causes igb to misbehave).
>
> $ ifconfig bxe0 inet 192.168.1.5/24
> bxe0: ../../../dev/bxe/if_bxe.c(10939): Memory allocation failure! Cannot fill
> fp[00] RX chain.
> bxe0: ../../../dev/bxe/if_bxe.c(3921): NIC initialization failed, aborting!
> $ ifconfig bxe1 inet 192.168.1.6/24
> bxe1: ../../../dev/bxe/if_bxe.c(10939): Memory allocation failure! Cannot fill
> fp[00] RX chain.
> bxe1: ../../../dev/bxe/if_bxe.c(3921): NIC initialization failed, aborting!
>
> (as expected, also sent mail off to maintainers w/respect to above notes/errors)

Sounds like you may be out of mbufs, which is easy to do: on a box with
4 igb's, simply booting without tuning will cause this. So if you have both
igb's and bxe's, this could be your cause.

Try adding the following to loader.conf and see if it helps:

kern.ipc.nmbclusters=51200

Regards
Steve

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the
person or entity to whom it is addressed. In the event of misdirection, the
recipient is prohibited from using, copying, printing or otherwise
disseminating it or any information contained in it.

In the event of misdirection, illegible or incomplete transmission please
telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.