Date: Tue, 27 Jan 2015 17:38:11 -0600
From: Jim Thompson <jim@netgate.com>
To: Antoine Beaupré <anarcat@koumbit.org>
Cc: freebsd-net@freebsd.org, wishmaster <artemrts@ukr.net>
Subject: Re: is polling still a thing?
Message-ID: <6BB47230-9AB8-4F0B-843B-7C51330F8306@netgate.com>
In-Reply-To: <87pp9zc1wk.fsf@marcos.anarc.at>
References: <871tmgceup.fsf@marcos.anarc.at> <1422384769.867067950.y2iiuu53@frv34.fwdcdn.com> <87pp9zc1wk.fsf@marcos.anarc.at>

> On Jan 27, 2015, at 4:08 PM, Antoine Beaupré <anarcat@koumbit.org> wrote:
>
> On 2015-01-27 13:57:20, wishmaster wrote:
>> Have you considered using netmap-based ipfw instead of pf for DDoS
>> mitigation? I think you should. And without any network "hacks" like
>> polling.
>
> My understanding of netmap was that it wasn't useful for packet
> forwarding, because its design is for transmitting packets directly to
> userland faster, whereas a router's data flow stays mostly in the router…

The problem is that the "data flow" in FreeBSD isn't very fast. (I'd go so far as to say "broken", but that's throwing rocks.)

But as long as the window is already broken: the rtentry locking is a good example of how the stack is broken. The lack of FIB caching is another issue, and packet-at-a-time-to-completion processing (no batching) is another. So: do 'N' packets' worth of address lookups (ACLs, …, etc.) at a time. Just like "Click" showed a decade ago (and where the polling mode was of use).

But it's trivial to build a packet forwarder (more L2 than L3, but all things are possible) using netmap (or DPDK) that smacks the FreeBSD (and Linux) stacks with a large stick.

The netmap code comes with a "bridge.c" example that is just that, a dead-simple bridge. Another example, "netmap-fwd", runs at 14.88Mpps between two 10Gbps interfaces. (Neither pf nor the kernel-resident ipfw will come close; both are more than an order of magnitude slower.)

Here's something a bit more than "dead simple": https://github.com/caladri/brilter

This would be even faster if Juli would use one of the Lua JITs, e.g.: http://wingolog.org/archives/2014/09/02/high-performance-packet-filtering-with-pflua

And if you want to go 'full tilt', Click has run on top of netmap since 2012: https://github.com/kohler/click/commits/netmap (the code is in the master branch, too. Use master.)

As for the netmap-ipfw code, it's 6.5Mpps to 10Mpps (later editions of the code): http://freebsd.1045724.n5.nabble.com/ipfw-meets-netmap-6-5-Mpps-in-userspace-td5734014.html

> I'm hesitant in switching back to ipfw, considering how nice the
> featureset and syntax of pf is. But if that's what's needed to restore
> sanity…

pf is sane? No, I don't think so. (Yes, it does say "pf" at the front of "pfSense". So what? I mean, have you looked at the code?)

Turn off polling, unless you know you need it. You'll know you 'need it' if you start making changes to the stack.

There is a lot of "mystery meat" in most fields, and the field of computers / operating systems contains its share.

As a somewhat associated example, Intel says, "hyperthreading helps (networking) performance!" 6WIND says this too. FreeBSD developers say, "hyperthreading hurts performance".

In the end, it depends on what is stalling the CPU. Hyper-threading is a trick to share the write pipes on the core, and traditional implementations of memcpy() will fill these pipes (call them buffers if you like).
And the stack does a lot of "memcpy()". (I'm waiting for the yowls of "we zero-copy!", because anyone who asserts this just hasn't looked at the stack.) There are tricks (if your code is interleaving access to the write pipes well, you'll see more benefit. This really wants cache-aligned data structures, etc.)

So, that's just a long-winded "YMMV".

Jim