Date: Thu, 23 Sep 2010 14:33:56 -0500
From: Tom Judge <tom@tomjudge.com>
To: David Christensen <davidch@broadcom.com>
Cc: "pyunyh@gmail.com" <pyunyh@gmail.com>, "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>, "yongari@freebsd.org" <yongari@freebsd.org>
Subject: Re: bce(4) - com_no_buffers (Again)
Message-ID: <4C9BABA4.1060805@tomjudge.com>
In-Reply-To: <4C9BA9FD.50406@tomjudge.com>
References: <4C894A76.5040200@tomjudge.com> <20100910002439.GO7203@michelle.cdnetworks.com> <4C8E3D79.6090102@tomjudge.com> <20100913184833.GF1229@michelle.cdnetworks.com> <4C8E768E.7000003@tomjudge.com> <20100913193322.GG1229@michelle.cdnetworks.com> <4C8E8BD1.5090007@tomjudge.com> <20100913205348.GJ1229@michelle.cdnetworks.com> <4C9B6CBD.2030408@tomjudge.com> <5D267A3F22FD854F8F48B3D2B52381933B5A78B484@IRVEXCHCCR01.corp.ad.broadcom.com> <4C9BA9FD.50406@tomjudge.com>

The throttle command I am using in the tests is the one from here:
http://klicman.org/throttle/

On 09/23/2010 02:26 PM, Tom Judge wrote:
> On 09/23/2010 01:21 PM, David Christensen wrote:
>>>>> Under testing I have yet to see a memory fragmentation issue with
>>>>> this driver. I will follow up if/when I find a problem with this
>>>>> again.
>>>
>>> So here we are again. The system is locking up again because of 9k
>>> mbuf allocation failures.
>>
>> Failure to allocate a new buffer should cause the driver to
>> drop the received frame and reuse the buffer, not lock up the
>> system. Are you seeing the lockup come from bce(4) or does
>> it come from somewhere else due to the dropped data?
>
> The lockup is not from the NIC as such; the systems have the appearance
> of locking up as home directories are on NFS and the user information is
> stored in a remote LDAP server. When the system starts to drop frames
> due to lack of 9k memory regions it tends to last for a few minutes
> (when it is really bad) and stop all traffic into the system. This
> appears to the average user as a complete system pause.
>
>>>>> Is there a way to fix the RX buffer shortage issues (when header
>>>>> splitting is turned on) so that they are guarded by flow control?
>>>>> Maybe change the low watermark for flow control when it's enabled?
>>>>
>>>> I'm not sure how much it would help but try changing RX low
>>>> watermark. Default value is 32 which seems to be a reasonable value.
>>>> But it's only for 5709/5716 controllers and Linux seems to use a
>>>> different default value.
>>>
>>> These are: NetXtreme II BCM5709 Gigabit Ethernet
>>>
>>> So my next task is to turn the watermark related defines into sysctls
>>> and turn on header splitting so that I can try to tune them without
>>> having to reboot.
>>
>> Do you have flow control enabled? There are arguments both for
>> and against flow control. For bce(4), I haven't tested flow control
>> for quite a while and its behavior may have changed since it is
>> controlled by firmware. Keep an eye on the hardware statistics
>> to see that it's actively generating pause frames.
>
> At the moment I have a number of tests:
>
> 1) With flow control disabled and header splitting on or off flood the
> server with very small frames (200 bytes). This will trigger the
> firmware to drop frames due to BD shortages (incrementing
> dev.bce.X.com_no_buffers).
>
> Traffic source:
>
>     route change test-system -mtu 200
>     dd if=/dev/zero bs=8000 | nc -l 1111
>
> Test system:
>
>     nc source 1111 > /dev/null
>
> 2) With flow control enabled and header splitting off flood the server
> with traffic with very slow userland processing:
>
> Traffic source:
>
>     for I in 1 2 3 4 5 6 7 8; do ( dd if=/dev/zero bs=8000 | nc -l 1111$I & ); done
>
> Test system:
>
>     8*
>     nc source 1111$I | throttle -k 1 > /dev/null
>
> On our systems this will reliably trigger denied 9k allocations.
>
> 3) With flow control enabled and header splitting on flood the server
> with very small frames (200 bytes). (Using the same test as in case 1).
> My aim is to tune the watermark here so that there are no frames dropped
> due to BD shortages.
>
> I am under the impression that the best solution is to tune the RX ring
> so that flow control can be disabled but I am not sure I could do this.
>
>>> My next question is, is it possible to increase the size of the RX ring
>>> without switching to RSS?
>>
>> I have a change I've been working on to allow RX/TX ring size
>> to be adjusted through a sysctl. Let me pretty it up a bit and
>> send it to you for test. You should be able to adjust the ring
>> size without enabling RSS.
>
> If you can provide a patch I have hardware available to test on.
>
> Thanks
>
> Tom

--
TJU13-ARIN
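
While the three tests above run, it can help to log the two symptoms side by side: the firmware drop counter dev.bce.X.com_no_buffers and the number of denied 9k jumbo cluster requests. The following is only a minimal sketch, not something posted in the thread. It assumes that netstat -m prints a line of the form "N/N/N requests for jumbo clusters denied (4k/9k/16k)", as it does on the FreeBSD releases of this period, and the function names denied_9k and watch_counters are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of bce(4) or FreeBSD.

# Read `netstat -m` output on stdin and print the 9k field of the
# "requests for jumbo clusters denied (4k/9k/16k)" line.
denied_9k() {
    awk '/requests for jumbo clusters denied/ { split($1, f, "/"); print f[2] }'
}

# Print both counters every $2 seconds (default 5) for bce unit $1
# (default 0). Runs until interrupted.
watch_counters() {
    unit="${1:-0}"
    interval="${2:-5}"
    while :; do
        no_bufs=$(sysctl -n "dev.bce.${unit}.com_no_buffers")
        denied=$(netstat -m | denied_9k)
        echo "$(date +%T) com_no_buffers=${no_bufs} denied_9k=${denied}"
        sleep "$interval"
    done
}
```

After sourcing the file, running "watch_counters 0 5" on the test system prints one line every five seconds. A rising com_no_buffers with a flat denied_9k would point at BD shortage (tests 1 and 3), while a rising denied_9k would point at the 9k allocation failures of test 2.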