Date: Fri, 24 Sep 2010 12:44:32 -0500
From: Tom Judge <tom@tomjudge.com>
To: David Christensen <davidch@broadcom.com>
Cc: "pyunyh@gmail.com" <pyunyh@gmail.com>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, "yongari@freebsd.org" <yongari@freebsd.org>
Subject: Re: bce(4) - com_no_buffers (Again)
Message-ID: <4C9CE380.6020906@tomjudge.com>
In-Reply-To: <4C9BABA4.1060805@tomjudge.com>
References: <4C894A76.5040200@tomjudge.com> <20100910002439.GO7203@michelle.cdnetworks.com> <4C8E3D79.6090102@tomjudge.com> <20100913184833.GF1229@michelle.cdnetworks.com> <4C8E768E.7000003@tomjudge.com> <20100913193322.GG1229@michelle.cdnetworks.com> <4C8E8BD1.5090007@tomjudge.com> <20100913205348.GJ1229@michelle.cdnetworks.com> <4C9B6CBD.2030408@tomjudge.com> <5D267A3F22FD854F8F48B3D2B52381933B5A78B484@IRVEXCHCCR01.corp.ad.broadcom.com> <4C9BA9FD.50406@tomjudge.com> <4C9BABA4.1060805@tomjudge.com>
This is a multi-part message in MIME format.
--------------000604030004000001090500
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

On 09/23/2010 02:33 PM, Tom Judge wrote:
> The throttle command I am using in the tests is the one from here:
>
> http://klicman.org/throttle/
>
> On 09/23/2010 02:26 PM, Tom Judge wrote:
>> On 09/23/2010 01:21 PM, David Christensen wrote:
>>>>>> Under testing I have yet to see a memory fragmentation issue
>>>>>> with this driver. I'll follow up if/when I find a problem with
>>>>>> this again.
>>>>
>>>> So here we are again. The system is locking up again because of
>>>> 9k mbuf allocation failures.
>>>
>>> Failure to allocate a new buffer should cause the driver to
>>> drop the received frame and reuse the buffer, not lock up the
>>> system. Are you seeing the lockup come from bce(4) or does
>>> it come from somewhere else due to the dropped data?
>>
>> The lockup is not from the NIC as such; the systems have the
>> appearance of locking up because home directories are on NFS and the
>> user information is stored in a remote LDAP server. When the system
>> starts to drop frames due to a lack of 9k memory regions, it tends to
>> last for a few minutes (when it is really bad) and stops all traffic
>> into the system. This appears to the average user as a complete
>> system pause.
>>
>>>>>> Is there a way to fix the RX buffer shortage issues (when header
>>>>>> splitting is turned on) so that they are guarded by flow control?
>>>>>> Maybe change the low watermark for flow control when it's
>>>>>> enabled?
>>>>>
>>>>> I'm not sure how much it would help, but try changing the RX low
>>>>> watermark. The default value is 32, which seems to be a reasonable
>>>>> value. But it's only for 5709/5716 controllers, and Linux seems to
>>>>> use a different default value.
>>>>
>>>> These are: NetXtreme II BCM5709 Gigabit Ethernet
>>>>
>>>> So my next task is to turn the watermark related defines into
>>>> sysctls and turn on header splitting so that I can try to tune
>>>> them without having to reboot.
>>>
>>> Do you have flow control enabled? There are arguments both for
>>> and against flow control. For bce(4), I haven't tested flow control
>>> for quite a while and its behavior may have changed since it is
>>> controlled by firmware. Keep an eye on the hardware statistics
>>> to see that it's actively generating pause frames.
>>
>> 3) With flow control enabled and header splitting on, flood the
>> server with very small frames (200 bytes). (Using the same test as
>> in case 1.) My aim is to tune the watermark here so that there are
>> no frames dropped due to BD shortages.

Card info unhidden:

bce0: ASIC (0x57092003); Rev (C0); Bus (PCIe x4, 2.5Gbps); B/C (5.2.2); Flags (SPLT|MSI|MFW); MFW (NCSI 2.0.8)

So, having done lots of testing with both flow control and header
splitting turned on, it seems like flow control may be broken with
header splitting?

I have been using the attached patch to play with the flow control
watermarks. I have tried the following data points and am finding it
difficult to get flow control to kick in before the card runs out of
descriptors and starts dropping frames:

low: 16   high: 127
low: 32   high: 127
low: 64   high: 127
low: 96   high: 127
low: 32   high: 196
low: 64   high: 196
low: 128  high: 256

None of these seems to have any noticeable effect on the drop rate or
on the dev.bce.0.stat_FlowControlDone count in the sample period.

Thoughts?

Tom

-- 
TJU13-ARIN

--------------000604030004000001090500
Content-Type: text/plain; name="if_bce.patch.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="if_bce.patch.txt"

Index: if_bce.c
===================================================================
--- if_bce.c	(revision 949)
+++ if_bce.c	(working copy)
@@ -511,6 +511,21 @@
 SYSCTL_UINT(_hw_bce, OID_AUTO, msi_enable, CTLFLAG_RDTUN, &bce_msi_enable, 0,
 "MSI-X|MSI|INTx selector");
 
+
+/* Tunable RX flow control low water mark. */
+/* Without header splitting the default is 32 */
+static int bce_rx_low_water_mark = BCE_L2CTX_RX_LO_WATER_MARK_DEFAULT;
+TUNABLE_INT("hw.bce.rx_low_water_mark", &bce_rx_low_water_mark);
+SYSCTL_UINT(_hw_bce, OID_AUTO, rx_low_water_mark, CTLFLAG_RDTUN, &bce_rx_low_water_mark, 0,
+"Default RX Flow Control Low Water Mark");
+
+/* Tunable RX flow control high water mark. */
+/* Without header splitting the default is 32 */
+static int bce_rx_high_water_mark = USABLE_RX_BD / 4;
+TUNABLE_INT("hw.bce.rx_high_water_mark", &bce_rx_high_water_mark);
+SYSCTL_UINT(_hw_bce, OID_AUTO, rx_high_water_mark, CTLFLAG_RDTUN, &bce_rx_high_water_mark, 0,
+"Default RX Flow Control High Water Mark");
+
 /* ToDo: Add tunable to enable/disable strict MTU handling. */
 /* Currently allows "loose" RX MTU checking (i.e. sets the  */
 /* H/W RX MTU to the size of the largest receive buffer, or */
@@ -1780,11 +1795,15 @@
 	}
 
 	if (mii->mii_media_active & IFM_FLAG1) {
+		BCE_PRINTF("%s(%d): Enabling TX flow control.\n",
+		    __FILE__, __LINE__);
 		DBPRINT(sc, BCE_INFO_PHY,
 		    "%s(): Enabling TX flow control.\n", __FUNCTION__);
 		BCE_SETBIT(sc, BCE_EMAC_TX_MODE, BCE_EMAC_TX_MODE_FLOW_EN);
 		sc->bce_flags |= BCE_USING_TX_FLOW_CONTROL;
 	} else {
+		BCE_PRINTF("%s(%d): Disabling TX flow control.\n",
+		    __FILE__, __LINE__);
 		DBPRINT(sc, BCE_INFO_PHY,
 		    "%s(): Disabling TX flow control.\n", __FUNCTION__);
 		BCE_CLRBIT(sc, BCE_EMAC_TX_MODE, BCE_EMAC_TX_MODE_FLOW_EN);
@@ -5414,7 +5433,7 @@
 		u32 lo_water, hi_water;
 
 		if (sc->bce_flags && BCE_USING_TX_FLOW_CONTROL) {
-			lo_water = BCE_L2CTX_RX_LO_WATER_MARK_DEFAULT;
+			lo_water = bce_rx_low_water_mark;
 		} else {
 			lo_water = 0;
 		}
@@ -5423,11 +5442,12 @@
 			lo_water = 0;
 		}
 
-		hi_water = USABLE_RX_BD / 4;
+		hi_water = bce_rx_high_water_mark;
 
 		if (hi_water <= lo_water) {
 			lo_water = 0;
 		}
 
+		BCE_PRINTF("Setting Up Flow Control (Pre Scaling), Low Watermark: %d, High Watermark: %d\n", (int)lo_water, (int)hi_water);
 		lo_water /= BCE_L2CTX_RX_LO_WATER_MARK_SCALE;
 		hi_water /= BCE_L2CTX_RX_HI_WATER_MARK_SCALE;
@@ -5436,7 +5456,8 @@
 			hi_water = 0xf;
 		else if (hi_water == 0)
 			lo_water = 0;
-
+
+		BCE_PRINTF("Setting Up Flow Control (Post Scaling), Low Watermark: %d, High Watermark: %d\n", (int)lo_water, (int)hi_water);
 		val |= (lo_water << BCE_L2CTX_RX_LO_WATER_MARK_SHIFT) |
 		    (hi_water << BCE_L2CTX_RX_HI_WATER_MARK_SHIFT);
 	}

--------------000604030004000001090500--
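
The pre-scaling and post-scaling steps that the patch logs can be sketched as a standalone function, which may help when reading the two BCE_PRINTF lines for a given pair of tunables. This is only a sketch under assumptions: `scale_watermarks` and `struct wm` are hypothetical names, the scale factors are passed in by the caller because their values are not given in this message, the `hi_water > 0xf` condition is inferred from the `hi_water = 0xf` context line, and the earlier bounds check on the low mark (whose condition is not visible in the diff) is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the post-scaling values. */
struct wm {
	uint32_t lo;
	uint32_t hi;
};

/*
 * Follow the order of operations visible in the patch:
 * 1. disable the low mark if the high mark does not exceed it,
 * 2. divide each mark by its scale factor,
 * 3. clamp the high mark to 0xf (assumed condition), or disable
 *    the low mark if the high mark scaled down to zero.
 * lo_scale and hi_scale stand in for the driver's
 * BCE_L2CTX_RX_*_WATER_MARK_SCALE constants; their real values are
 * an assumption of the caller.
 */
static struct wm
scale_watermarks(uint32_t lo_water, uint32_t hi_water,
    uint32_t lo_scale, uint32_t hi_scale)
{
	struct wm out;

	/* Guard against division by zero. */
	assert(lo_scale > 0 && hi_scale > 0);

	if (hi_water <= lo_water)
		lo_water = 0;

	lo_water /= lo_scale;
	hi_water /= hi_scale;

	if (hi_water > 0xf)
		hi_water = 0xf;
	else if (hi_water == 0)
		lo_water = 0;

	out.lo = lo_water;
	out.hi = hi_water;
	return (out);
}
```

If the scale factors were 4 for the low mark and 16 for the high mark (an assumption, not something stated in this thread), `scale_watermarks(32, 127, 4, 16)` would give low 8 and high 7, and `scale_watermarks(128, 256, 4, 16)` would give low 32 and high 15 after the clamp. Comparing such values against the "Post Scaling" log line is one way to check what actually reaches the hardware.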