From owner-freebsd-stable@FreeBSD.ORG Thu Sep 22 03:26:46 2011
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id ED510106564A;
	Thu, 22 Sep 2011 03:26:46 +0000 (UTC)
	(envelope-from leres@ee.lbl.gov)
Received: from fun.ee.lbl.gov (fun.ee.lbl.gov [IPv6:2001:400:610:102::ca])
	by mx1.freebsd.org (Postfix) with ESMTP id D81AC8FC13;
	Thu, 22 Sep 2011 03:26:46 +0000 (UTC)
Received: from ice.ee.lbl.gov (ice.ee.lbl.gov [131.243.2.213])
	(authenticated bits=0)
	by fun.ee.lbl.gov (8.14.5/8.14.5) with ESMTP id p8M3Qk24015821
	(version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO);
	Wed, 21 Sep 2011 20:26:46 -0700 (PDT)
Message-ID: <4E7AAAF6.7050004@ee.lbl.gov>
Date: Wed, 21 Sep 2011 20:26:46 -0700
From: Craig Leres
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:6.0.2) Gecko/20110906 Thunderbird/6.0.2
MIME-Version: 1.0
To: freebsd-stable@freebsd.org
X-Enigmail-Version: 1.4a1pre
Content-Type: multipart/mixed;
 boundary="------------070803060606090506000303"
Cc: John Baldwin
Subject: Re: Panic during kernel booting on HP Proliant DL180G6 and latest STABLE
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Thu, 22 Sep 2011 03:26:47 -0000

This is a multi-part message in MIME format.
--------------070803060606090506000303
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

I have a lot of Supermicro motherboards, and the newest ones have igb
chipsets; they've been quite a headache with respect to FreeBSD 8. I'm
running 8.2-RELEASE but have upgraded parts of my kernel to 8-RELENG (as
of a few months ago). Some of them work OK while others panic on bootup.
Upgrading to newer versions of the Intel igb code fixes some but breaks
others. It's been frustrating.

While working on this today, I saw two different kernel panics:

    Could not setup receive structures
    m_getzone: m_getjcl: invalid cluster type

I tried John Baldwin's patch but got the "invalid cluster type" panic,
so I backed it out. Later I figured out that either turning off
hw.igb.enable_msix (loader.conf) or raising kern.ipc.nmbclusters to
131072 (sysctl.conf) and setting hw.igb.num_queues to 4 (loader.conf)
would avoid the "receive structures" panic, but either way I was still
seeing the "invalid cluster type" panic.

Looking at m_getjcl(), I suspected the passed size to be 0; some
debugging confirmed this. Looks like a race here where a receive
interrupt comes in before adapter->rx_mbuf_sz has been initialized.

Attached is the hack I added to avoid the panic when booting. The idea
is to pretend m_getjcl() failed to allocate a cluster rather than to go
down in flames.

		Craig

--------------070803060606090506000303
Content-Type: text/plain;
 name="patch-if_igb.c.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="patch-if_igb.c.txt"

Index: if_igb.c
===================================================================
--- if_igb.c	(revision 31)
+++ if_igb.c	(working copy)
@@ -3695,6 +3695,11 @@
 				    htole64(hseg[0].ds_addr);
 no_split:
 		if (rxbuf->m_pack == NULL) {
+			if (adapter->rx_mbuf_sz == 0) {
+				printf("igb_refresh_mbufs: "
+				    "avoid m_getjcl() panic\n");
+				goto update;
+			}
 			mp = m_getjcl(M_DONTWAIT, MT_DATA,
 			    M_PKTHDR, adapter->rx_mbuf_sz);
 			if (mp == NULL)
@@ -3912,6 +3917,12 @@
 
 skip_head:
 		/* Now the payload cluster */
+		if (adapter->rx_mbuf_sz == 0) {
+			printf("igb_setup_receive_ring: "
+			    "avoid m_getjcl() panic\n");
+			error = ENOBUFS;
+			goto fail;
+		}
 		rxbuf->m_pack = m_getjcl(M_DONTWAIT, MT_DATA,
 		    M_PKTHDR, adapter->rx_mbuf_sz);
 		if (rxbuf->m_pack == NULL) {

--------------070803060606090506000303--
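
The idea behind the patch (treat an uninitialized buffer size as an
ordinary allocation failure instead of letting the allocator panic) can
be illustrated outside the kernel. What follows is only a minimal
userland sketch, not driver code: every sketch_* name is hypothetical,
the SKETCH_* sizes are assumed stand-ins for the kernel's cluster sizes,
and abort() stands in for the kernel panic.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed stand-ins for the kernel's supported cluster sizes. */
#define SKETCH_MCLBYTES		2048
#define SKETCH_MJUMPAGESIZE	4096
#define SKETCH_MJUM9BYTES	(9 * 1024)
#define SKETCH_MJUM16BYTES	(16 * 1024)

/* Return 1 if size names a known cluster type, 0 otherwise. */
static int
sketch_size_is_valid(int size)
{
	switch (size) {
	case SKETCH_MCLBYTES:
	case SKETCH_MJUMPAGESIZE:
	case SKETCH_MJUM9BYTES:
	case SKETCH_MJUM16BYTES:
		return (1);
	default:
		return (0);
	}
}

/*
 * Hypothetical allocator that, like the panic described above,
 * refuses to continue when handed an unknown size (0 included).
 */
static void *
sketch_getjcl(int size)
{
	if (!sketch_size_is_valid(size)) {
		fprintf(stderr, "sketch_getjcl: invalid cluster type\n");
		abort();
	}
	return (malloc((size_t)size));
}

/*
 * Guarded wrapper: if the size has not been initialized yet, report
 * an allocation failure (NULL) rather than reaching the allocator.
 */
static void *
sketch_guarded_getjcl(int rx_mbuf_sz)
{
	if (rx_mbuf_sz == 0) {
		fprintf(stderr, "sketch: avoid invalid-size abort\n");
		return (NULL);
	}
	return (sketch_getjcl(rx_mbuf_sz));
}
```

A caller that already copes with a NULL return from the allocator needs
no further changes, which is why the guard is cheap to add. It only
hides the symptom, though; the ordering problem that leaves the size at
0 is still there.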