Date: Tue, 24 Sep 2013 01:46:46 +0300
From: Sami Halabi <sodynet1@gmail.com>
To: "Alexander V. Chernikov" <melifaro@yandex-team.ru>
Cc: Adrian Chadd <adrian@freebsd.org>, Andre Oppermann <andre@freebsd.org>,
    "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>,
    "freebsd-arch@freebsd.org" <freebsd-arch@freebsd.org>,
    Luigi Rizzo <luigi@freebsd.org>, "Andrey V. Elsukov" <ae@freebsd.org>,
    FreeBSD Net <net@freebsd.org>
Subject: Re: Network stack changes
Message-ID: <CAEW+ogZttyScUBQQWht+YGfLEDU_APcoRyYeMy_wDseAcZwVnA@mail.gmail.com>
In-Reply-To: <523F4F14.9090404@yandex-team.ru>
References: <521E41CB.30700@yandex-team.ru>
 <CAJ-Vmo=N=HnZVCD41ZmDg2GwNnoa-tD0J0QLH80x=f7KA5d+Ug@mail.gmail.com>
 <523F4F14.9090404@yandex-team.ru>
Hi,

> http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
> http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff

I've tried the diff on 10-CURRENT: it applied cleanly, but compiling the
new kernel failed with errors. Is there any work in progress to make it
build? I'd love to test it.

Sami

On Sun, Sep 22, 2013 at 11:12 PM, Alexander V. Chernikov <
melifaro@yandex-team.ru> wrote:

> On 29.08.2013 15:49, Adrian Chadd wrote:
>
>> Hi,
>>
> Hello Adrian!
> I'm very sorry for the looong reply.
>
>> There's a lot of good stuff to review here, thanks!
>>
>> Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to
>> keep locking things like that on a per-packet basis. We should be able
>> to do this in a cleaner way - we can defer RX into a CPU-pinned
>> taskqueue and convert the interrupt handler to a fast handler that just
>> schedules that taskqueue. We can ignore the ithread entirely here.
>>
>> What do you think?
>>
> Well, it sounds good :) But performance numbers and Jack's opinion are
> more important :)
>
> Are you going to Malta?
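Just to check that I understand the filter/taskqueue split being proposed
here, a minimal and completely untested sketch of it follows. The rxq_*
names and the softc layout are invented for illustration, and the actual
CPU pinning of the taskqueue thread is left out:

/* Untested sketch: fast interrupt filter deferring all RX work to a
 * taskqueue thread, so no ithread and no per-packet locking in the
 * interrupt path. */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/bus.h>
#include <sys/priority.h>
#include <sys/taskqueue.h>

struct rxq_softc {
        struct taskqueue *rxq_tq;       /* per-RX-queue taskqueue */
        struct task      rxq_task;      /* deferred RX processing */
        void             *rxq_intrhand; /* interrupt cookie */
};

/* Fast handler: runs in interrupt context, takes no locks, touches no
 * mbufs -- it only kicks the taskqueue and claims the interrupt. */
static int
rxq_intr_filter(void *arg)
{
        struct rxq_softc *sc = arg;

        taskqueue_enqueue(sc->rxq_tq, &sc->rxq_task);
        return (FILTER_HANDLED);        /* no ithread involved at all */
}

/* All real RX work happens here, in the taskqueue thread: drain the RX
 * ring, refill descriptors, pass packets up the stack. */
static void
rxq_task_fn(void *arg, int pending)
{
}

/* Attach-time setup; dev and irq_res would come from the driver. */
static int
rxq_setup(device_t dev, struct resource *irq_res, struct rxq_softc *sc)
{
        sc->rxq_tq = taskqueue_create_fast("rxq", M_NOWAIT,
            taskqueue_thread_enqueue, &sc->rxq_tq);
        if (sc->rxq_tq == NULL)
                return (ENOMEM);
        taskqueue_start_threads(&sc->rxq_tq, 1, PI_NET, "%s rxq",
            device_get_nameunit(dev));
        TASK_INIT(&sc->rxq_task, 0, rxq_task_fn, sc);
        return (bus_setup_intr(dev, irq_res, INTR_TYPE_NET | INTR_MPSAFE,
            rxq_intr_filter, NULL, sc, &sc->rxq_intrhand));
}

The taskqueue thread would still need to be pinned to the right CPU (e.g.
via a cpuset on the thread) to get the "CPU pinned" part of the proposal.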
>> Totally pie in the sky handwaving at this point:
>>
>> * create an array of mbuf pointers for completed mbufs;
>> * populate the mbuf array;
>> * pass the array up to ether_demux().
>>
>> For vlan handling, it may end up populating its own list of mbufs to
>> push up to ether_demux(). So maybe we should extend the API to have a
>> bitmap of packets to actually handle from the array, so we can pass up
>> a larger array of mbufs, note which ones are for the destination and
>> then the upcall can mark which frames it has consumed.
>>
>> I specifically wonder how much work/benefit we may see by doing:
>>
>> * batching packets into lists so various steps can batch process things
>> rather than run to completion;
>> * batching the processing of a list of frames under a single lock
>> instance - e.g., if the forwarding code could do the forwarding lookup
>> for 'n' packets under a single lock, then pass that list of frames up
>> to inet_pfil_hook() to do the work under one lock, etc, etc.
>>
> I'm thinking the same way, but we're stuck with the 'forwarding lookup'
> due to the problem with the egress interface pointer, as I mentioned
> earlier. However, it is interesting to see how much it helps, regardless
> of locking.
>
> Currently I'm thinking that we should try to change radix to something
> different (it seems this can be tested quickly) and see what happens.
> Luigi's performance numbers for our radix are too awful, and there is a
> patch implementing an alternative trie:
> http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
> http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff
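For the array-plus-bitmap API sketched above, I picture something like
this purely hypothetical container -- none of these names exist in the
tree, and 64 is an arbitrary batch size chosen so a single uint64_t can
serve as the bitmap:

/* Hypothetical batch container for the array-plus-bitmap idea. */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>

#define MBUF_BATCH_SIZE 64              /* fits one 64-bit bitmap */

struct mbuf_batch {
        struct mbuf *mb_pkts[MBUF_BATCH_SIZE];
        int         mb_count;           /* entries filled in mb_pkts[] */
        uint64_t    mb_valid;           /* frames the callee should look at */
        uint64_t    mb_consumed;        /* frames the callee took ownership of */
};

/* Callee (e.g. a batched ether_demux() variant) marks a frame as taken;
 * the caller continues with whatever is still valid but unconsumed. */
static __inline void
mbuf_batch_consume(struct mbuf_batch *b, int idx)
{
        b->mb_consumed |= 1ULL << idx;
}

static __inline int
mbuf_batch_is_consumed(const struct mbuf_batch *b, int idx)
{
        return ((b->mb_consumed & (1ULL << idx)) != 0);
}

That would let the vlan code pick its frames out of a larger batch and
hand the remainder back, as described above.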
>> Here, the processing would look less like "grab lock and process to
>> completion" and more like "mark and sweep" - i.e., we have a list of
>> frames that we mark as needing processing and mark as having been
>> processed at each layer, so we know where to next dispatch them.
>>
>> I still have some tool coding to do with PMC before I even think about
>> tinkering with this, as I'd like to measure stuff like per-packet
>> latency as well as top-level processing overhead (i.e.,
>> CPU_CLK_UNHALTED.THREAD_P / lagg0 TX bytes/pkts, RX bytes/pkts, NIC
>> interrupts on that core, etc.)
>>
> That will be great to see!
>
>> Thanks,
>>
>> -adrian
>>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

--
Sami Halabi
Information Systems Engineer
NMS Projects Expert
FreeBSD SysAdmin Expert