Date: Wed, 16 Jan 2013 11:04:54 -0800
From: Devin Teske <devin.teske@fisglobal.com>
To: Steven Hartland <killing@multiplay.co.uk>
Cc: 'Ian Lepore' <freebsd@damnhippie.dyndns.org>, dteske@freebsd.org, freebsd-hackers@freebsd.org
Subject: Re: kgzip(1) is broken
Message-ID: <23AAEBCB-6438-42EB-9B2E-E657CFC3BA1B@fisglobal.com>
In-Reply-To: <B22DF1755E60453F939EB6D7A7361622@multiplay.co.uk>
References: <09b701cdf367$12737530$375a5f90$@freebsd.org>
  <1358291098.32417.134.camel@revolution.hippie.lan>
  <0a0001cdf375$60ddbc40$229934c0$@freebsd.org>
  <0a2301cdf37d$ebe705a0$c3b510e0$@fisglobal.com>
  <1358296967.32417.137.camel@revolution.hippie.lan>
  <0a4601cdf384$4ff98e40$efecaac0$@freebsd.org>
  <B22DF1755E60453F939EB6D7A7361622@multiplay.co.uk>

On Jan 15, 2013, at 5:07 PM, Steven Hartland wrote:

>
> ----- Original Message ----- From: <dteske@freebsd.org>
> To: "'Ian Lepore'" <freebsd@damnhippie.dyndns.org>
> Cc: <freebsd-hackers@freebsd.org>; <dteske@freebsd.org>
> Sent: Wednesday, January 16, 2013 12:56 AM
> Subject: RE: kgzip(1) is broken
>
>
>>> -----Original Message-----
>>> From: Ian Lepore [mailto:freebsd@damnhippie.dyndns.org]
>>> Sent: Tuesday, January 15, 2013 4:43 PM
>>> To: Devin Teske
>>> Cc: dteske@freebsd.org; freebsd-hackers@freebsd.org
>>> Subject: RE: kgzip(1) is broken
>>> On Tue, 2013-01-15 at 16:10 -0800, Devin Teske wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: Devin Teske [mailto:devin.teske@fisglobal.com] On Behalf Of dteske@freebsd.org
>>>>> Sent: Tuesday, January 15, 2013 3:10 PM
>>>>> To: 'Ian Lepore'
>>>>> Cc: freebsd-hackers@freebsd.org; dteske@freebsd.org
>>>>> Subject: RE: kgzip(1) is broken
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ian Lepore [mailto:freebsd@damnhippie.dyndns.org]
>>>>>> Sent: Tuesday, January 15, 2013 3:05 PM
>>>>>> To: dteske@freebsd.org
>>>>>> Cc: freebsd-hackers@freebsd.org
>>>>>> Subject: Re: kgzip(1) is broken
>>>>>>
>>>>>> On Tue, 2013-01-15 at 13:27 -0800, dteske@freebsd.org wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I have been sad of-late because kgzip(1) no longer produces a usable kernel.
>>>>>>>
>>>>>>> All versions of 9.x suffer this.
>>>>>>>
>>>>>>> And somewhere between 8.3-RELEASE-p1 and 8.3-RELEASE-p5 this recently broke in
>>>>>>> the 8.x series.
>>>>>>>
>>>>>>> I haven't tried the 7 series lately, but if whatever is making the rounds gets
>>>>>>> MFC'd that far back, I expect the problem to percolate there too.
>>>>>>>
>>>>>>> The symptom is that the machine reboots immediately and unexpectedly the moment
>>>>>>> the kernel is executed by the loader.
>>>>>>>
>>>>>>> This is quite troubling and I am looking for someone to help find the culprit. I
>>>>>>> don't know where to start looking.
>>>>>>
>>>>>> Here are some possible candidates from the things that were MFC'd to 8
>>>>>> in that timeframe. I haven't looked at what these do, they're just
>>>>>> changes that affect files related to booting.
>>>>>>
>>>>>> r233211
>>>>>> r233377
>>>>>> r233469
>>>>>> r234563
>>>>>>
>>>>>
>>>>> Thanks Ian!
>>>>>
>>>>> I'll test each one individually to see if regressing any one (or all) addresses
>>>>> the problem.
>>>>
>>>> Progress...
>>>>
>>>> Looks like I found the culprit.
>>>>
>>>> Turns out it's a back-ported bxe(4) driver (back-ported from 9 -- where kgzip
>>>> seems to never work).
>>>>
>>>> I wonder why back-porting bxe(4) from stable/9 to releng/8.3 would cause kgzip
>>>> to produce non-working kernels.
>>>>
>>> Yeah, it'll be interesting to see how a device driver can lead to "the
>>> machine reboots immediately and unexpectedly the moment the kernel is
>>> executed by the loader," which I took to mean "before seeing the
>>> copyright or anything."
>> Indeed... loader throws up the syms and upon execution *KABOOM* (screen goes
>> black and back to POST)
>> The copyright never appears.
>>>> I'm emailing the maintainers (davidch + other Broadcom folk)
>> The current dossier is even more interesting... the back-ported driver (with
>> zero modifications mind you from stable/9 to stable/8) exhibits memory failures
>> (example below), and causes terminals to become wedged when attempting to (for
>> example) scp a file over an existing configured network (igb-based -- presumably
>> unrelated to bxe but in practice loading bxe causes igb to misbehave).
>> $ ifconfig bxe0 inet 192.168.1.5/24
>> bxe0: ../../../dev/bxe/if_bxe.c(10939): Memory allocation failure! Cannot fill fp[00] RX chain.
>> bxe0: ../../../dev/bxe/if_bxe.c(3921): NIC initialization failed, aborting!
>> $ ifconfig bxe1 inet 192.168.1.6/24
>> bxe1: ../../../dev/bxe/if_bxe.c(10939): Memory allocation failure! Cannot fill fp[00] RX chain.
>> bxe1: ../../../dev/bxe/if_bxe.c(3921): NIC initialization failed, aborting!
>> (as expected, also sent mail off to maintainers w/respect to above notes/errors)
>
> Sounds like you may be out of mbufs which is easy, on a box with 4 igb's simply
> booting without tuning will cause this so, if you have igb's and bxe's this
> could be your cause.
>
> Try adding the following to loader.conf and see if it helps:-
> kern.ipc.nmbclusters=51200
>

Sorry for delayed response -- we had to go through a power cycle.

I haven't yet tried bumping the value as suggested, but I suspect it will indeed help greatly -- I noticed that I got 18% into the scp before things took a dive for the worse (hanging terminals and such).

Another thing worth noting about the uplifted bxe(4) plopped into RELENG_8... when we rebooted:

bxe0: ../../../dev/bxe/if_bxe.c(6419): Slowpath queue is full!
bxe0: ---------- Begin crash dump ----------
bxe0: ---------- End crash dump ----------
bxe0: ../../../dev/bxe/if_bxe.c(6419): Slowpath queue is full!
bxe0: ---------- Begin crash dump ----------
bxe0: ---------- End crash dump ----------
bxe0: ../../../dev/bxe/if_bxe.c(3262): fp[01] client ramrod halt failed!

Heh. The machine had to be hard cycled.

-- 
Devin

_____________
The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you.