Date: Thu, 22 Sep 2011 10:31:05 -0700
From: "Vogel, Jack" <jack.vogel@intel.com>
To: Jeremy Chadwick <freebsd@jdc.parodius.com>, David G Lawrence <dg@dglawrence.com>
Cc: "freebsd-stable@freebsd.org" <freebsd-stable@freebsd.org>, John Baldwin <jhb@freebsd.org>, Craig Leres <leres@ee.lbl.gov>
Subject: RE: Panic during kernel booting on HP Proliant DL180G6 and latest STABLE
Message-ID: <1DB50624F8348F48840F2E2CF6040A9D01986290C5@orsmsx508.amr.corp.intel.com>
In-Reply-To: <20110922102732.GA60730@icarus.home.lan>
References: <4E7AAAF6.7050004@ee.lbl.gov> <20110922101156.GA11465@black.dglawrence.com> <20110922102732.GA60730@icarus.home.lan>
[-- Attachment #1 --]
-----Original Message-----
From: Jeremy Chadwick [mailto:freebsd@jdc.parodius.com]
Sent: Thursday, September 22, 2011 3:28 AM
To: David G Lawrence
Cc: Craig Leres; freebsd-stable@freebsd.org; John Baldwin; Vogel, Jack
Subject: Re: Panic during kernel booting on HP Proliant DL180G6 and latest STABLE

On Thu, Sep 22, 2011 at 03:11:56AM -0700, David G Lawrence wrote:
> > I have a lot of supermicro motherboards and the newest ones have igb
> > chipsets; they've been quite a headache with respect to FreeBSD 8. I'm
> > running 8.2-RELEASE but have upgraded parts of my kernel to 8-RELENG (as
> > of a few months ago). Some of them work ok while others panic on bootup.
> > Upgrading to newer versions of the intel igb code fixes some but breaks
> > others. It's been frustrating.
> >
> > While working on this today, I saw two different kernel panics:
> >
> > Could not setup receive structures
> > m_getzone: m_getjcl: invalid cluster type
>
> I fixed this awhile back in my local sources. A 12 core Supermicro
> MB system I'm building here was hitting the bug 100% of the time during
> startup. Patch attached.
>
> -DG
>
> Dr. David G. Lawrence
> President
> Download Technologies, Inc. - http://www.downloadtech.com - (866) 399 8500
> Pave the road of life with opportunities.
>
> Index: if_igb.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/dev/e1000/if_igb.c,v
> retrieving revision 1.21.2.20
> diff -c -r1.21.2.20 if_igb.c
> *** if_igb.c	29 Jun 2011 16:16:59 -0000	1.21.2.20
> --- if_igb.c	22 Sep 2011 10:04:31 -0000
> ***************
> *** 1278,1286 ****
>   	/* Don't lose promiscuous settings */
>   	igb_set_promisc(adapter);
>
> - 	ifp->if_drv_flags |= IFF_DRV_RUNNING;
> - 	ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
> -
>   	callout_reset(&adapter->timer, hz, igb_local_timer, adapter);
>   	e1000_clear_hw_cntrs_base_generic(&adapter->hw);
>
> --- 1278,1283 ----
> ***************
> *** 1308,1313 ****
> --- 1305,1313 ----
>
>   	/* Don't reset the phy next time init gets called */
>   	adapter->hw.phy.reset_disable = TRUE;
> +
> + 	ifp->if_drv_flags |= IFF_DRV_RUNNING;
> + 	ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
>   }
>
>   static void
> ***************
> *** 1490,1501 ****
>   	E1000_WRITE_REG(&adapter->hw, E1000_EIMC, que->eims);
>   	++que->irqs;
>
>   	IGB_TX_LOCK(txr);
>   	more_tx = igb_txeof(txr);
>   	IGB_TX_UNLOCK(txr);
>
> - 	more_rx = igb_rxeof(que, adapter->rx_process_limit, NULL);
> -
>   	if (igb_enable_aim == FALSE)
>   		goto no_calc;
>   	/*
> --- 1490,1505 ----
>   	E1000_WRITE_REG(&adapter->hw, E1000_EIMC, que->eims);
>   	++que->irqs;
>
> + 	if (!(adapter->ifp->if_drv_flags & IFF_DRV_RUNNING)) {
> + 		return;
> + 	}
> +
> + 	more_rx = igb_rxeof(que, adapter->rx_process_limit, NULL);
> +
>   	IGB_TX_LOCK(txr);
>   	more_tx = igb_txeof(txr);
>   	IGB_TX_UNLOCK(txr);
>
>   	if (igb_enable_aim == FALSE)
>   		goto no_calc;
>   	/*

CC'ing Jack Vogel. Jack, any insights with regards to this patch?
This also touches on what Adrian was mentioning as well, at least to
some degree.

I have a slight modification to John's earlier changes, namely, it masks the
full EIMC register when setup to use MSIX, I'm thinking this might be why the
earlier patch failed?

The code in this latest email is something that I would not want to use if
possible, it's just trying to avoid the problem. Please test with this change
instead.

Regards,

Jack

[-- Attachment #2 --]
--- if_igb.c	2011-04-28 08:28:59.000000000 -0700
+++ if_igb.jfv.c	2011-09-22 01:58:49.000000000 -0700
@@ -99,7 +99,7 @@
 /*********************************************************************
  *  Driver version:
  *********************************************************************/
-char igb_driver_version[] = "version - 2.2.3";
+char igb_driver_version[] = "version - 2.2.3 - test";


 /*********************************************************************
@@ -598,16 +598,6 @@
 		goto err_late;
 	}

-	/*
-	** Configure Interrupts
-	*/
-	if ((adapter->msix > 1) && (igb_enable_msix))
-		error = igb_allocate_msix(adapter);
-	else /* MSI or Legacy */
-		error = igb_allocate_legacy(adapter);
-	if (error)
-		goto err_late;
-
 	/* Setup OS specific network interface */
 	if (igb_setup_interface(dev, adapter) != 0)
 		goto err_late;
@@ -651,6 +641,16 @@
 	adapter->led_dev = led_create(igb_led_func, adapter,
 	    device_get_nameunit(dev));

+	/*
+	** Configure Interrupts
+	*/
+	if ((adapter->msix > 1) && (igb_enable_msix))
+		error = igb_allocate_msix(adapter);
+	else /* MSI or Legacy */
+		error = igb_allocate_legacy(adapter);
+	if (error)
+		goto err_late;
+
 	INIT_DEBUGOUT("igb_attach: end");

 	return (0);
@@ -659,10 +659,10 @@
 	igb_free_transmit_structures(adapter);
 	igb_free_receive_structures(adapter);
 	igb_release_hw_control(adapter);
-	if (adapter->ifp != NULL)
-		if_free(adapter->ifp);
 err_pci:
 	igb_free_pci_resources(adapter);
+	if (adapter->ifp != NULL)
+		if_free(adapter->ifp);
 	free(adapter->mta, M_DEVBUF);
 	IGB_CORE_LOCK_DESTROY(adapter);

@@ -2167,6 +2167,9 @@
 	adapter->msix = igb_setup_msix(adapter);
 	adapter->hw.back = &adapter->osdep;

+	/* Make sure no interrupts come in early */
+	igb_disable_intr(adapter);
+
 	return (0);
 }
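For readers following the thread, the ordering problem that both patches address (an interrupt handler running before the receive structures exist) can be sketched in a few lines of standalone C. This is only an illustrative model, not driver code: `toy_adapter`, `toy_attach`, `toy_init`, and `toy_intr` are hypothetical names that do not appear in if_igb.c, and the two booleans merely stand in for `IFF_DRV_RUNNING` and the hardware interrupt mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver softc; not taken from if_igb.c. */
struct toy_adapter {
    bool running;       /* models IFF_DRV_RUNNING */
    bool intr_enabled;  /* models the interrupt mask being open */
    int *rx_ring;       /* NULL until receive structures are set up */
    int rx_processed;   /* number of interrupts that serviced the ring */
};

static int toy_ring_storage[4];

/* Attach: nothing is usable yet, so keep interrupts masked. */
static void
toy_attach(struct toy_adapter *sc)
{
    sc->running = false;
    sc->rx_ring = NULL;
    sc->rx_processed = 0;
    sc->intr_enabled = false;   /* "make sure no interrupts come in early" */
}

/* Init: set up the ring first, mark the interface running last. */
static void
toy_init(struct toy_adapter *sc)
{
    sc->rx_ring = toy_ring_storage;
    sc->running = true;
    sc->intr_enabled = true;
}

/* Interrupt handler: returns true only if the receive ring was serviced. */
static bool
toy_intr(struct toy_adapter *sc)
{
    if (!sc->intr_enabled)
        return (false);
    /* Guard in the spirit of the first patch: ignore early interrupts. */
    if (!sc->running)
        return (false);
    assert(sc->rx_ring != NULL);
    sc->rx_ring[0] = 1;
    sc->rx_processed++;
    return (true);
}
```

In this model, calling `toy_intr()` after `toy_attach()` does nothing, even if `intr_enabled` is forced to true to simulate a stray interrupt, because `running` is still false. Only after `toy_init()` does the handler touch the ring. The first patch in the thread corresponds to the `running` check, while the second corresponds to keeping the mask closed and hooking up interrupts late.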
