From owner-freebsd-arch@FreeBSD.ORG Sun Sep 22 19:59:44 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 1D1CDC67; Sun, 22 Sep 2013 19:59:44 +0000 (UTC) (envelope-from melifaro@yandex-team.ru) Received: from forward-corp1g.mail.yandex.net (forward-corp1g.mail.yandex.net [IPv6:2a02:6b8:0:1402::10]) by mx1.freebsd.org (Postfix) with ESMTP id 69E7022F4; Sun, 22 Sep 2013 19:59:43 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1g.mail.yandex.net (Yandex) with ESMTP id DD945366009A; Sun, 22 Sep 2013 23:59:40 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id 1E1F82C0058; Sun, 22 Sep 2013 23:59:40 +0400 (MSK) Received: from dhcp170-36-red.yandex.net (dhcp170-36-red.yandex.net [95.108.170.36]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTP id vYfnD1rFej-xei4WIF6; Sun, 22 Sep 2013 23:59:40 +0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1379879980; bh=PTdCNocghNrGKUJpHQ2f83+w9MCRgnEAdFhS+RlHncU=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:Content-Type:Content-Transfer-Encoding; b=m6k8ItAvMBHUq9A03HVz2UgoLshfWFr7lzQILEivApjPY3CedK7sTAjtsEYuN6dK2 cqo8EiYKcUUqHgbvGbjhOwqPb2rZmcIQEpWaRpLv+OQdJsfpFLlXSuBZOmJWZ67rA6 lbBAKa1YMckWMMC+v7GnojZuQt6OwTrLUKzTw3+o= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <523F4BED.6000804@yandex-team.ru> Date: Sun, 22 Sep 2013 23:58:37 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130824 Thunderbird/17.0.8 MIME-Version: 1.0 To: Andre Oppermann Subject: Re: Network stack changes References: <521E41CB.30700@yandex-team.ru> <521E78B0.6080709@freebsd.org> In-Reply-To: <521E78B0.6080709@freebsd.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: adrian@freebsd.org, freebsd-hackers@freebsd.org, FreeBSD Net , luigi@freebsd.org, ae@FreeBSD.org, Gleb Smirnoff , freebsd-arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Sep 2013 19:59:44 -0000 On 29.08.2013 02:24, Andre Oppermann wrote: > On 28.08.2013 20:30, Alexander V. Chernikov wrote: >> Hello list! > > Hello Alexander, Hello Andre! I'm very sorry to answer so late. > > you sent quite a few things in the same email. I'll try to respond > as much as I can right now. Later you should split it up to have > more in-depth discussions on the individual parts. > > If you could make it to the EuroBSDcon 2013 DevSummit that would be > even more awesome. Most of the active network stack people will be > there too. I've sent presentation describing nearly the same things to devsummit@ so I hope this can be discussed in Networking group. I hope to attend DevSummit & EuroBSDcon. > >> There is a lot constantly raising discussions related to networking >> stack performance/changes. >> >> I'll try to summarize current problems and possible solutions from my >> point of view. 
>> (Generally this is one problem: stack is >> slooooooooooooooooooooooooooow, but we need to know why and >> what to do). > > Compared to others its not thaaaaaaat slow. ;) > >> Let's start with current IPv4 packet flow on a typical router: >> http://static.ipfw.ru/images/freebsd_ipv4_flow.png >> >> (I'm sorry I can't provide this as text since Visio don't have any >> 'ascii-art' exporter). >> >> Note that we are using process-to-completion model, e.g. process any >> packet in ISR until it is either >> consumed by L4+ stack or dropped or put to egress NIC queue. >> >> (There is also deferred ISR model implemented inside netisr but it >> does not change much: >> it can help to do more fine-grained hashing (for GRE or other similar >> traffic), but >> 1) it uses per-packet mutex locking which kills all performance >> 2) it currently does not have _any_ hashing functions (see absence of >> flags in `netstat -Q`) >> People using http://static.ipfw.ru/patches/netisr_ip_flowid.diff (or >> modified PPPoe/GRE version) >> report some profit, but without fixing (1) it can't help much >> ) >> >> So, let's start: >> >> 1) Ixgbe uses mutex to protect each RX ring which is perfectly fine >> since there is nearly no contention >> (the only thing that can happen is driver reconfiguration which is >> rare and, more signifficant, we >> do this once >> for the batch of packets received in given interrupt). However, due >> to some (im)possible deadlocks >> current code >> does per-packet ring unlock/lock (see ixgbe_rx_input()). >> There was a discussion ended with nothing: >> http://lists.freebsd.org/pipermail/freebsd-net/2012-October/033520.html >> >> 1*) Possible BPF users. Here we have one rlock if there are any >> readers present >> (and mutex for any matching packets, but this is more or less OK. >> Additionally, there is WIP to >> implement multiqueue BPF >> and there is chance that we can reduce lock contention there). > > Rlock to rmlock? Yes, probably. > >> There is also an "optimize_writers" hack permitting applications >> like CDP to use BPF as writers but not registering them as receivers >> (which implies rlock) > > I believe longer term we should solve this with a protocol type > "ethernet" > so that one can send/receive ethernet frames through a normal socket. Yes. AF_LINK or any similar. > >> 2/3) Virtual interfaces (laggs/vlans over lagg and other simular >> constructions). >> Currently we simply use rlock to make s/ix0/lagg0/ and, what is much >> more funny - we use complex >> vlan_hash with another rlock to >> get vlan interface from underlying one. >> >> This is definitely not like things should be done and this can be >> changed more or less easily. > > Indeed. > >> There are some useful terms/techniques in world of software/hardware >> routing: they have clear >> 'control plane' and 'data plane' separation. >> Former one is for dealing control traffic (IGP, MLD, IGMP snooping, >> lagg hellos, ARP/NDP, etc..) and >> some data traffic (packets with TTL=1, with options, destined to >> hosts without ARP/NDP record, and >> similar). Latter one is done in hardware (or effective software >> implementation). >> Control plane is responsible to provide data for efficient data plane >> operations. This is the point >> we are missing nearly everywhere. > > ACK. > >> What I want to say is: lagg is pure control-plane stuff and vlan is >> nearly the same. 
We can't apply
>> this approach to complex cases like
>> lagg-over-vlans-over-vlans-over-(pppoe_ng0-and_wifi0)
>> but we definitely can do this for most common setups like (igb* or
>> ix* in lagg with or without vlans on top of lagg).
>
> ACK.
>
>> We already have some capabilities like VLANHWFILTER/VLANHWTAG, we can
>> add some more. We even have per-driver hooks to program HW filtering.
>
> We could. Though for vlan it looks like it would be easier to remove the
> hardware vlan tag stripping and insertion. It only adds complexity in all
> drivers for no gain.

No. Actually, as far as I understand, it helps the driver to perform TSO.
Anyway, IMO we should use HW capabilities if we can (this probably does not
add much speed on 1G, but on 10/20/40G it can help much more).

>> One small step to do is to throw packet to vlan interface directly
>> (P1), proof-of-concept (working in production):
>> http://lists.freebsd.org/pipermail/freebsd-net/2013-April/035270.html
>>
>> Another is to change lagg packet accounting:
>> http://lists.freebsd.org/pipermail/svn-src-all/2013-April/067570.html
>> Again, this is more like HW boxes do (aggregate all counters
>> including errors) (and I can't imagine what real error we can get
>> from _lagg_).
>
>> 4) If we are router, we can do either slooow ip_input() ->
>> ip_forward() -> ip_output() cycle or use
>> optimized ip_fastfwd() which falls back to 'slow' path for
>> multicast/options/local traffic (e.g. works exactly like the
>> 'data plane' part).
>> (Btw, we can consider net.inet.ip.fastforwarding to be turned on by
>> default at least for non-IPSEC kernels)
>
> ACK.
>
>> Here we have to determine if this is local packet or not, e.g.
>> F(dst_ip) returning 1 or 0. Currently
>> we are simply using standard rlock + hash of iface addresses.
>> (And some consumers like ipfw(4) do the same, but without lock).
>> We don't need to do this! We can build sorted array of IPv4 addresses
>> or other efficient structure on every address change and use it
>> unlocked with delayed garbage collection (proof-of-concept attached)
>
> I'm a bit uneasy with unlocked access. On very weakly ordered
> architectures this could trip over cache coherency issues. A rmlock is
> essentially for free in the read case.

Well, I'm talking of:
1) allocate the _new_ memory (unlocked);
2) commit the _new_ copy for the given address list (rlock);
3) change the pointer -- as far as I understand, readers can then see
either the old or the new value;
4) use delayed GC (the open question is how long we should wait before
deletion).
Anyway, protecting the (optimized) list with an rmlock would also do.

>
>> (There is another thing to discuss: maybe we can do this once
>> somewhere in ip_input and mark mbuf as 'local/non-local' ? )
>
> The problem is packet filters may change the destination address and thus
> can invalidate such a lookup.

Yes. So either the filter or the ip_input() routing code should re-inspect
the packet, exactly like it is done currently (the ipfw fwd code for
IPv4/IPv6).

>
>> 5, 9) Currently we have L3 ingress/egress PFIL hooks protected by
>> rmlocks. This is OK.
>>
>> However, 6) and 7) are not.
>> Firewall can use the same pfil lock as reader protection without
>> imposing its own lock. Currently pfil&ipfw code is ready to do this.
>
> The problem with the global pfil rmlock is the comparatively long time it
> is held in a locked state. Also packet filters may have to acquire
> additional locks when they have to modify state tables. Rmlocks are not
> made for that because they pin the thread to the cpu they're currently
> on. This is what Gleb is complaining about.

Yes, additional locks are the problem.

> My idea is to hold the pfil rmlock only for the lookup of the first/next
> packet filter that will run, not for the entire duration. That would
> solve the problem. However packet filters then have to use their own
> locks again, which could be rmlock too.

Well, we haven't changed anything yet :)

>> 8) Radix/rt* api. This is probably the worst place in the entire stack.
>> It is toooo generic, tooo slow
>> and buggy (do you use IPv6? you definitely know what I'm talking about).
>> A) It really is too generic, and the assumption that it can be
>> (effectively) used for every family is wrong. Two examples:
>> we don't need to look up all 128 bits of an IPv6 address. Subnets with
>> mask >/64 are not used widely
>> (actually the only reason to use them are p2p links, due to potential
>> ND problems).
>> One of the common solutions is to look up 64 bits, and build another
>> trie (or other structure) in case of collision.
>> Another example is MPLS where we can simply do direct array lookup
>> based on the ingress label.
>
> Yes. While we shouldn't throw it out, it should be run as RIB and
> allow a much more protocol specific FIB for the hot packet path.
>
>> B) It is terribly slow (AFAIR luigi@ did some performance measurements,
>> numbers available in one of the netmap pdfs)
>
> Again not thaaaat slow but inefficient enough.

I've found the paper I was talking about:
http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
It claims that our radix is able to do 6MPPS/core and that it does not
scale with the number of cores.

>
>> C) It is not multipath-capable. Stateful (and non-working) multipath
>> is definitely not the right way.
>
> Indeed.
>
>> 8*) rtentry
>> We are doing it wrong.
>> Currently _every_ lookup locks/unlocks the given rte twice.
>> The first lock is related to an old-old story of trusting IP redirects
>> (and auto-adding host routes for them). Hopefully currently it is
>> disabled automatically when you turn forwarding on.
>
> They're disabled.
>
>> The second one is much more complicated: we are assuming that rte's
>> with non-zero refcount value can
>> stop the egress interface from being destroyed.
>> This is a wrong (but widely used) assumption.
>
> Not really. The reason for the refcount is not the ifp reference but
> other code parts that may hold direct pointers to the rtentry and do
> direct dereferencing to access information in it.

Yes, but what information?

>
>> We can use delayed GC instead of locking for rte's and this won't
>> break things more than they are broken now (patch attached).
>
> Nope. Delayed GC is not the way to go here. To do away with rtentry
> locking and refcounting we have to change rtalloc(9) to return the
> information the caller wants (e.g. ifp, ia, others) and not the rtentry
> address anymore. So instead of rtalloc() we have rtlookup().

It depends on what we want to do next. My idea (briefly) is to have:
1) adjacency/nhop structures describing next hops, with rewrite info and a
list of interface indices to do L2 multipath;
2) "rtentry" holding a link to an array of nhops to do L3 multipath
(more or less the same as Cisco CEF and others do).
And, anyway, we still have to protect from interface departure.
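Roughly, something like this (a sketch only: all structure and field names
here are invented for illustration, this is not a worked-out API):

    #include <stdint.h>

    /*
     * Sketch of the nhop idea: a FIB entry does not hand out a locked,
     * refcounted rtentry; a lookup just yields one of N next hops, each
     * carrying the pre-built L2 rewrite data and the egress ifnet index.
     */
    struct nhop {
            uint16_t        nh_ifidx;        /* egress interface index */
            uint16_t        nh_mtu;
            uint8_t         nh_prepend_len;  /* valid bytes in nh_prepend */
            char            nh_prepend[64];  /* pre-built L2 header */
    };

    struct fib_entry {
            uint32_t        fe_nhop_count;   /* >1 means L3 multipath */
            struct nhop     *fe_nhops;       /* array of next hops */
    };

    /*
     * rtlookup()-style result: pick a next hop by flowid so a given
     * session always maps to the same path; no rte lock, no refcount.
     * Assumes fe_nhop_count >= 1.
     */
    static inline const struct nhop *
    fib_entry_select(const struct fib_entry *fe, uint32_t flowid)
    {
            return (&fe->fe_nhops[flowid % fe->fe_nhop_count]);
    }

The point is that the lookup result is self-contained: the caller gets the
prepend data and ifindex and never touches the rtentry itself. Interface
departure then becomes the only lifetime problem left, and it can be
handled once, at the ifnet level, instead of per-rte.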
>
>> We can't do the same for ifp structures since
>> a) virtual ones can assume some state in the underlying physical NIC
>> b) physical ones just _can_ be destroyed (maybe regardless of whether
>> the user wants this or not, like: SFP being unplugged from the NIC)
>> or simply lead to kernel crash due to SW/HW inconsistency
>
> Here I actually believe we can do a GC or stable storage based approach.
> Ifp pointers are kept in too many places and properly refcounting it is
> very (too) hard. So whenever an interface gets destroyed or disappears
> it's callable function pointers are replaced with dummies returning an
> error. The ifp in memory will stay for some time and even may be reused
> for another new interface later again (Cisco does it that way in their
> IOS).

Yes. But we are not holding any (relevant) lock while doing the actual
transmit (e.g. calling if_output after performing the L2 rewrite in
ether_output), so some cores will see old pointers.

>> One possible solution is to implement stable refcounts based on
>> PCPU counters, and apply those counters to ifp, but this seems to be
>> non-trivial.
>>
>> Another rtalloc(9) problem is the fact that radix is used as both
>> the 'control plane' and 'data plane' structure/api. Some users always
>> want to put more information in rte, while others
>> want to make rte more compact. We just need _different_ structures
>> for that.
>
> ACK.
>
>> A feature-rich, lot-of-data control plane one (to store everything we
>> want to store, including, for example, the PID of the process
>> originating the route) - current radix can be modified to do this.
>> And another, address-family-dependent structure (array, trie, or
>> anything) which contains _only_ the data necessary to put the packet
>> on the wire.
>
> ACK.
>
>> 11) arpresolve. Currently (this was decoupled in 8.x) we have
>> a) ifaddr rlock
>> b) lle rlock.
>>
>> We don't need those locks.
>> We need to
>> a) make the lle layer per-interface instead of global (and this can
>> also solve the issue of multiple fibs and L2 mappings done in fib.0)
>
> Yes!
>
>> b) use the rtalloc(9)-provided lock instead of separate locking
>
> No. Interface rmlock.

Discussable :)

>> c) actually, we need to rewrite this layer because
>> d) lle actually is the place to do real multipath:
>
> No, you can do multipath through more than one interface. If lle is
> per interface that won't work and is not the right place.
>
>> briefly,
>> you have rte pointing to some special nexthop structure pointing to
>> lle, which has the following data:
>> num_of_egress_ifaces: [ifindex1, ifindex2, ifindex3] | L2 data to
>> prepend to header
>> Separate post will follow.
>
> This should be part of the RIB/FIB and select one of the ifp+nexthops
> to return on lookup.

Yes.

>
>> With the following, we can achieve lagg traffic distribution without
>> actually using lagg_transmit
>> and similar stuff (at least in most common scenarios)
>
> This seems to be a rather nasty layering violation.

Not really. lagg is pure virtual stuff.

>
>> (for example, TCP output definitely can benefit from this, since we
>> can account the flowid once for a TCP session and use it in every mbuf)
>
>> So. Imagine we have done all this. How can we estimate the difference?
>>
>> There was a thread, started a year ago, describing 'stock'
>> performance and the difference for various modifications.
>> It is done on 8.x, however I've got similar results on recent 9.x
>>
>> http://lists.freebsd.org/pipermail/freebsd-net/2012-July/032680.html
>>
>> Briefly:
>>
>> 2xE5645 @ Intel 82599 NIC.
>> Kernel: FreeBSD-8-S r237994, stock drivers, stock routing, no
>> FLOWTABLE, no firewall. Ixia XM2 (traffic generator) <> ix0 (FreeBSD).
>> Ixia sends 64byte IP packets from vlan10 (10.100.0.64 -
>> 10.100.0.156) to destinations in vlan11 (10.100.1.128 -
>> 10.100.1.192). Static arps are configured for all destination
>> addresses. Traffic level is slightly above or slightly below system
>> performance.
>>
>> we start from 1.4MPPS (if we are using several routes to minimize
>> mutex contention).
>>
>> My 'current' result for the same test, on the same HW, with the
>> following modifications:
>>
>> * 1) ixgbe per-packet ring unlock removed
>> * P1) ixgbe is modified to do direct vlan input (so 2,3 are not used)
>> * 4) separate lockless in_localip() version
>> * 6) - using existing pfil lock
>> * 7) using lockless version
>> * 8) radix converted to use rmlock instead of rlock. Delayed GC is
>> used instead of mutexes
>> * 10) - using existing pfil lock
>> * 11) using radix lock to do arpresolve(). Not using lle rlock
>>
>> (so the rmlocks are the only locks used on the data path).
>>
>> Additionally, ipstat counters are converted to PCPU (no real
>> performance implications).
>> ixgbe does not do per-packet accounting (as in head).
>> if_vlan counters are converted to PCPU.
>> lagg is converted to rmlock, per-packet accounting is removed (using
>> stats from the underlying interfaces).
>> lle hash size is bumped to 1024 instead of 32 (not applicable here,
>> but it slows things down for large L2 domains).
>>
>> The result is 5.6 MPPS for a single port (11 cores) and 6.5 MPPS for
>> lagg (16 cores), nearly the same for HT on and 22 cores.
>
> That's quite good, but we want more. ;)
>
>> ..
>> while Intel DPDK claims 80MPPS (and 6windgate talks about 160 or so)
>> on the same-class hardware and _userland_ forwarding.
>
> Those numbers sound a bit far out. Maybe if the packet isn't touched
> or looked at at all in a pure netmap interface to interface bridging
> scenario. I don't believe these numbers.

http://www.intel.com/content/dam/www/public/us/en/documents/presentation/dpdk-packet-processing-ia-overview-presentation.pdf

Luigi talks about very fast L4 lookups in his (and his colleagues') work.
Anyway, even a simple 8-8-8-8 multi-bit trie can be very fast.

>
>> One of the key features making all such products possible (DPDK,
>> netmap, packetshader, Cisco SW forwarding) is the use of batching
>> instead of the process-to-completion model.
>> Batching mitigates locking cost, batching does not wash out the CPU
>> cache, and so on.
>
> The work has to be done eventually. Batching doesn't relieve from it.
> IMHO batch moving is only the last step we should look at. It makes
> the stack rather complicated and introduces other issues like packet
> latency.
>
>> So maybe we can consider passing batches from the NIC to at least the
>> L2 layer with netisr? or even up to ip_input() ?
>
> And then? You probably won't win much in the end (if the lock path
> is optimized).

At least I can firewall them "all at once". Next steps depend on how we
solve the egress ifp problem. But yes, this is definitely not the first
thing to do.

>
>> Another question is about making some sort of reliable GC
>> ("passive serialization" or other similar hard-to-pronounce words
>> about Linux and lockless objects).
>
> Rmlocks are our secret weapon and just as good.
>
>> P.S. Attached patches are 1) for 8.x 2) mostly 'hacks' showing
>> roughly how this can be done and what benefit can be achieved.
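P.P.S. To illustrate the 8-8-8-8 multi-bit trie remark above: the lookup is
just four indexed loads, one per address byte. This is a rough sketch only,
with invented names, and it assumes "leaf pushing" (the best matching
prefix is propagated down on insertion, so lookup never backtracks); no
compression or insertion logic is shown:

    #include <stdint.h>
    #include <stddef.h>

    /*
     * Stride-8 (8-8-8-8) multibit trie sketch for IPv4.  Each level is
     * a 256-entry node indexed by one byte of the destination address.
     */
    struct trie8 {
            struct trie8    *next[256];     /* child, or NULL if terminal */
            uint32_t        nhop[256];      /* nexthop index if terminal */
    };

    static inline uint32_t
    trie8_lookup(const struct trie8 *t, uint32_t dst)  /* host order */
    {
            int shift;

            for (shift = 24; shift >= 8; shift -= 8) {
                    uint8_t i = (dst >> shift) & 0xff;

                    if (t->next[i] == NULL)
                            return (t->nhop[i]); /* pushed-down best match */
                    t = t->next[i];
            }
            return (t->nhop[dst & 0xff]);        /* last byte, /32 level */
    }

The obvious cost is memory (each node is several KB), which is exactly
what structures like DXR compress away.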
> From owner-freebsd-arch@FreeBSD.ORG Sun Sep 22 20:02:22 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id D762FFB8; Sun, 22 Sep 2013 20:02:22 +0000 (UTC) (envelope-from melifaro@yandex-team.ru) Received: from forward-corp1g.mail.yandex.net (forward-corp1g.mail.yandex.net [IPv6:2a02:6b8:0:1402::10]) by mx1.freebsd.org (Postfix) with ESMTP id 827D92358; Sun, 22 Sep 2013 20:02:22 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1g.mail.yandex.net (Yandex) with ESMTP id C54B9366009A; Mon, 23 Sep 2013 00:02:20 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id 2EF512C0337; Mon, 23 Sep 2013 00:02:20 +0400 (MSK) Received: from dhcp170-36-red.yandex.net (dhcp170-36-red.yandex.net [95.108.170.36]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTP id sZ8d66c12V-2KiurqgF; Mon, 23 Sep 2013 00:02:20 +0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1379880140; bh=ckqWsTOIe/Kq2fUVwOM8tdSMGh1wrt74DVOIcVSoROQ=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:Content-Type:Content-Transfer-Encoding; b=UqhwclJ5XqEh3ojfScL1sDTZYC6IyJM1t6cWll5wv/fCnjrOTu4KyAfG/wQzY9rkf FNwVc0pgaxTowGHI03/Hgr3+l9ByEYnLs4ZAX36XWRwghuhfnKgXNNKA8RXVJZ8xyb lztFaDePItV+hzB91UQp9tJrQKpHpBB8VWUc+YRI= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <523F4C8D.6080903@yandex-team.ru> Date: Mon, 23 Sep 2013 00:01:17 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130824 Thunderbird/17.0.8 MIME-Version: 1.0 To: Slawa Olhovchenkov Subject: Re: Network stack changes References: <521E41CB.30700@yandex-team.ru> <521E78B0.6080709@freebsd.org> <20130829013241.GB70584@zxy.spb.ru> In-Reply-To: <20130829013241.GB70584@zxy.spb.ru> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: adrian@freebsd.org, Andre Oppermann , freebsd-hackers@freebsd.org, freebsd-arch@freebsd.org, luigi@freebsd.org, ae@FreeBSD.org, Gleb Smirnoff , FreeBSD Net X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Sep 2013 20:02:22 -0000 On 29.08.2013 05:32, Slawa Olhovchenkov wrote: > On Thu, Aug 29, 2013 at 12:24:48AM +0200, Andre Oppermann wrote: > >>> .. >>> while Intel DPDK claims 80MPPS (and 6windgate talks about 160 or so) on the same-class hardware and >>> _userland_ forwarding. >> Those numbers sound a bit far out. Maybe if the packet isn't touched >> or looked at at all in a pure netmap interface to interface bridging >> scenario. I don't believe these numbers. > 80*64*8 = 40.960 Gb/s > May be DCA? And use CPU with 40 PCIe lane and 4 memory chanell. 
Intel introduces DDIO instead of DCA: http://www.intel.com/content/www/us/en/io/direct-data-i-o.html (and it seems DCA does not help much): https://www.myricom.com/software/myri10ge/790-how-do-i-enable-intel-direct-cache-access-dca-with-the-linux-myri10ge-driver.html https://www.myricom.com/software/myri10ge/783-how-do-i-get-the-best-performance-with-my-myri-10g-network-adapters-on-a-host-that-supports-intel-data-direct-i-o-ddio.html (However, DPDK paper notes DDIO is of signifficant helpers) From owner-freebsd-arch@FreeBSD.ORG Sun Sep 22 20:13:11 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 956362E0; Sun, 22 Sep 2013 20:13:11 +0000 (UTC) (envelope-from melifaro@yandex-team.ru) Received: from forward-corp1f.mail.yandex.net (forward-corp1f.mail.yandex.net [IPv6:2a02:6b8:0:801::10]) by mx1.freebsd.org (Postfix) with ESMTP id 0AEAD23F6; Sun, 22 Sep 2013 20:13:11 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1f.mail.yandex.net (Yandex) with ESMTP id 9743C2420022; Mon, 23 Sep 2013 00:13:07 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id CB2EB2C032B; Mon, 23 Sep 2013 00:13:06 +0400 (MSK) Received: from dhcp170-36-red.yandex.net (dhcp170-36-red.yandex.net [95.108.170.36]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTP id zPyEtHgBff-D6iqY3ni; Mon, 23 Sep 2013 00:13:06 +0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1379880786; bh=lc1BwebjnC7LP0vJSar/INsu9fAdxzJlXjhZRROe3wc=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:Content-Type:Content-Transfer-Encoding; b=TiFhbX28OlyJCId0ZK+VwWQp36YyQ+h5WJh3zgC8KN+0g0HD9q83tsYSnh8yv3F8L xX3AllLMhINn8mDOk4NZSsUIRGpnNhckim3gXJVni//gJQRXbaQZLY79Q45TEGh/pn 8RzHV1D9iadYLhu84SNJrkjRmQNZV1G2ug+v8DQo= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <523F4F14.9090404@yandex-team.ru> Date: Mon, 23 Sep 2013 00:12:04 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130824 Thunderbird/17.0.8 MIME-Version: 1.0 To: Adrian Chadd Subject: Re: Network stack changes References: <521E41CB.30700@yandex-team.ru> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Luigi Rizzo , Andre Oppermann , "freebsd-hackers@freebsd.org" , FreeBSD Net , "Andrey V. Elsukov" , Gleb Smirnoff , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Sep 2013 20:13:11 -0000 On 29.08.2013 15:49, Adrian Chadd wrote: > Hi, Hello Adrian! I'm very sorry for the looong reply. > > There's a lot of good stuff to review here, thanks! > > Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to > keep locking things like that on a per-packet basis. We should be able > to do this in a cleaner way - we can defer RX into a CPU pinned > taskqueue and convert the interrupt handler to a fast handler that > just schedules that taskqueue. We can ignore the ithread entirely here. 
> > What do you think? Well, it sounds good :) But performance numbers and Jack opinion is more important :) Are you going to Malta? > > Totally pie in the sky handwaving at this point: > > * create an array of mbuf pointers for completed mbufs; > * populate the mbuf array; > * pass the array up to ether_demux(). > > For vlan handling, it may end up populating its own list of mbufs to > push up to ether_demux(). So maybe we should extend the API to have a > bitmap of packets to actually handle from the array, so we can pass up > a larger array of mbufs, note which ones are for the destination and > then the upcall can mark which frames its consumed. > > I specifically wonder how much work/benefit we may see by doing: > > * batching packets into lists so various steps can batch process > things rather than run to completion; > * batching the processing of a list of frames under a single lock > instance - eg, if the forwarding code could do the forwarding lookup > for 'n' packets under a single lock, then pass that list of frames up > to inet_pfil_hook() to do the work under one lock, etc, etc. I'm thinking the same way, but we're stuck with 'forwarding lookup' due to problem with egress interface pointer, as I mention earlier. However it is interesting to see how much it helps, regardless of locking. Currently I'm thinking that we should try to change radix to something different (it seems that it can be checked fast) and see what happened. Luigi's performance numbers for our radix are too awful, and there is a patch implementing alternative trie: http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff > > Here, the processing would look less like "grab lock and process to > completion" and more like "mark and sweep" - ie, we have a list of > frames that we mark as needing processing and mark as having been > processed at each layer, so we know where to next dispatch them. > > I still have some tool coding to do with PMC before I even think about > tinkering with this as I'd like to measure stuff like per-packet > latency as well as top-level processing overhead (ie, > CPU_CLK_UNHALTED.THREAD_P / lagg0 TX bytes/pkts, RX bytes/pkts, NIC > interrupts on that core, etc.) That will be great to see! > > Thanks, > > > > -adrian > From owner-freebsd-arch@FreeBSD.ORG Sun Sep 22 22:06:04 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id ACEF0C12; Sun, 22 Sep 2013 22:06:04 +0000 (UTC) (envelope-from slw@zxy.spb.ru) Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) by mx1.freebsd.org (Postfix) with ESMTP id 68C9E29BF; Sun, 22 Sep 2013 22:06:04 +0000 (UTC) Received: from slw by zxy.spb.ru with local (Exim 4.69 (FreeBSD)) (envelope-from ) id 1VNrp0-000Llg-HY; Mon, 23 Sep 2013 02:08:06 +0400 Date: Mon, 23 Sep 2013 02:08:06 +0400 From: Slawa Olhovchenkov To: "Alexander V. 
Chernikov" Subject: Re: Network stack changes Message-ID: <20130922220806.GK3796@zxy.spb.ru> References: <521E41CB.30700@yandex-team.ru> <521E78B0.6080709@freebsd.org> <20130829013241.GB70584@zxy.spb.ru> <523F4C8D.6080903@yandex-team.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <523F4C8D.6080903@yandex-team.ru> User-Agent: Mutt/1.5.21 (2010-09-15) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: slw@zxy.spb.ru X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false Cc: adrian@freebsd.org, Andre Oppermann , freebsd-hackers@freebsd.org, freebsd-arch@freebsd.org, luigi@freebsd.org, ae@FreeBSD.org, Gleb Smirnoff , FreeBSD Net X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Sep 2013 22:06:04 -0000 On Mon, Sep 23, 2013 at 12:01:17AM +0400, Alexander V. Chernikov wrote: > On 29.08.2013 05:32, Slawa Olhovchenkov wrote: > > On Thu, Aug 29, 2013 at 12:24:48AM +0200, Andre Oppermann wrote: > > > >>> .. > >>> while Intel DPDK claims 80MPPS (and 6windgate talks about 160 or so) on the same-class hardware and > >>> _userland_ forwarding. > >> Those numbers sound a bit far out. Maybe if the packet isn't touched > >> or looked at at all in a pure netmap interface to interface bridging > >> scenario. I don't believe these numbers. > > 80*64*8 = 40.960 Gb/s > > May be DCA? And use CPU with 40 PCIe lane and 4 memory chanell. > Intel introduces DDIO instead of DCA: > http://www.intel.com/content/www/us/en/io/direct-data-i-o.html > (and it seems DCA does not help much): > https://www.myricom.com/software/myri10ge/790-how-do-i-enable-intel-direct-cache-access-dca-with-the-linux-myri10ge-driver.html > https://www.myricom.com/software/myri10ge/783-how-do-i-get-the-best-performance-with-my-myri-10g-network-adapters-on-a-host-that-supports-intel-data-direct-i-o-ddio.html > > (However, DPDK paper notes DDIO is of signifficant helpers) Ha, Intel paper say SMT is signifficant better HT. In real word -- same shit. For network application, if buffring need more then L3 cache, what happening? May be some bad things... 
From owner-freebsd-arch@FreeBSD.ORG Sun Sep 22 20:16:21 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 53BB2667; Sun, 22 Sep 2013 20:16:21 +0000 (UTC) (envelope-from melifaro@yandex-team.ru) Received: from forward-corp1e.mail.yandex.net (forward-corp1e.mail.yandex.net [IPv6:2a02:6b8:0:202::10]) by mx1.freebsd.org (Postfix) with ESMTP id EC30D242B; Sun, 22 Sep 2013 20:16:20 +0000 (UTC) Received: from smtpcorp4.mail.yandex.net (smtpcorp4.mail.yandex.net [95.108.252.2]) by forward-corp1e.mail.yandex.net (Yandex) with ESMTP id 95BAF640D0A; Mon, 23 Sep 2013 00:16:17 +0400 (MSK) Received: from smtpcorp4.mail.yandex.net (localhost [127.0.0.1]) by smtpcorp4.mail.yandex.net (Yandex) with ESMTP id 581372C032B; Mon, 23 Sep 2013 00:16:16 +0400 (MSK) Received: from dhcp170-36-red.yandex.net (dhcp170-36-red.yandex.net [95.108.170.36]) by smtpcorp4.mail.yandex.net (nwsmtp/Yandex) with ESMTP id jtjruyBg15-GGiaTnd3; Mon, 23 Sep 2013 00:16:16 +0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru; s=default; t=1379880976; bh=gvzJN62AvNEgGoGNHIYfEvCzLEcKQ9CJd4v9pZemI00=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:Content-Type:Content-Transfer-Encoding; b=Gc+87YUS6ZIHr6f8OibtRuhzdovntf4M5P1jnv32kAeNRLqwz3/oJ2UxeURkPMrID rfYovH/KAOgmb5HMFh3SgBd2fcOQME6BrBubz6ELgGGcqEiBEEevu5QbGoqcve1PRb tuPOPlllk/24RD9I7z4/G5jCBW9yrIuBlc0AVoao= Authentication-Results: smtpcorp4.mail.yandex.net; dkim=pass header.i=@yandex-team.ru Message-ID: <523F4FD1.6060807@yandex-team.ru> Date: Mon, 23 Sep 2013 00:15:13 +0400 From: "Alexander V. Chernikov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130824 Thunderbird/17.0.8 MIME-Version: 1.0 To: =?ISO-8859-1?Q?Olivier_Cochard-Labb=E9?= Subject: Re: Network stack changes References: <521E41CB.30700@yandex-team.ru> <6BDA4619-783C-433E-9819-A7EAA0BD3299@neville-neil.com> <20130914142802.GC71010@onelab2.iet.unipi.it> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Mailman-Approved-At: Sun, 22 Sep 2013 22:06:49 +0000 Cc: Adrian Chadd , Andre Oppermann , "freebsd-hackers@freebsd.org" , George Neville-Neil , "freebsd-arch@freebsd.org" , Luigi Rizzo , "Andrey V. Elsukov" , Gleb Smirnoff , FreeBSD Net X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 22 Sep 2013 20:16:21 -0000 On 14.09.2013 22:49, Olivier Cochard-Labbé wrote: > On Sat, Sep 14, 2013 at 4:28 PM, Luigi Rizzo wrote: >> IXIA ? For the timescales we need to address we don't need an IXIA, >> a netmap sender is more than enough >> > The great netmap generates only one IP flow (same src/dst IP and same > src/dst port). > This don't permit to test multi-queue NIC (or SMP packet-filter) on a > simple lab like this: > netmap sender => freebsd router => netmap receiver I've got the variant which is capable on doing linerate pcap replays on single queue. 
(However this is true for small pcaps only) > > Regards, > > Olivier From owner-freebsd-arch@FreeBSD.ORG Mon Sep 23 04:42:35 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id D0597AD1; Mon, 23 Sep 2013 04:42:35 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-wg0-x236.google.com (mail-wg0-x236.google.com [IPv6:2a00:1450:400c:c00::236]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 836B62B50; Mon, 23 Sep 2013 04:42:34 +0000 (UTC) Received: by mail-wg0-f54.google.com with SMTP id m15so2690852wgh.9 for ; Sun, 22 Sep 2013 21:42:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; bh=P2w1WbOvUNizXn7ZsZS2TBooYqeg4cGtYAdYuHYhYXM=; b=QOwrZB5l04e+6jyPEqzodo92zbc+lgK+Acasb9IebLx+0pMt++4YSkD1Y69XGeJN47 O5q78g4joGcHZCXb4Sn5q/+b8pgFZzoqCX3HbDHnjWLGeLaeIzvsMPRMJ2QjUd+DPgI4 kQeWEDgL2puChSMNZNc0GP+Yrc5jMlRtlX16RU2tYuX0zwZ74ydipyKtRv7zVzP5SD2i QNIdpI9rcwdFWjQBTbPx24KSQcZX+2ykEAL8KEFOJpfUIDs6zPXLWk25mKbutnQm3DZW SKWgHGVY1h0IpgLyJvTbTq+B4vDL1URAj6tEcZU0Ne9IVvTY0BRjmiX7vQU/6ObMDegy GoVA== MIME-Version: 1.0 X-Received: by 10.180.10.136 with SMTP id i8mr11840167wib.46.1379911352875; Sun, 22 Sep 2013 21:42:32 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.216.73.133 with HTTP; Sun, 22 Sep 2013 21:42:32 -0700 (PDT) In-Reply-To: <523F4F14.9090404@yandex-team.ru> References: <521E41CB.30700@yandex-team.ru> <523F4F14.9090404@yandex-team.ru> Date: Sun, 22 Sep 2013 21:42:32 -0700 X-Google-Sender-Auth: Xk5o-T54T2C85UHlUNeXvbPizgo Message-ID: Subject: Re: Network stack changes From: Adrian Chadd To: "Alexander V. Chernikov" Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Luigi Rizzo , Andre Oppermann , "freebsd-hackers@freebsd.org" , FreeBSD Net , "Andrey V. Elsukov" , Gleb Smirnoff , "freebsd-arch@freebsd.org" X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Sep 2013 04:42:36 -0000 Hi! On 22 September 2013 13:12, Alexander V. Chernikov wrote: > I'm thinking the same way, but we're stuck with 'forwarding lookup' due > to problem with egress interface pointer, as I mention earlier. However it > is interesting to see how much it helps, regardless of locking. > > Currently I'm thinking that we should try to change radix to something > different (it seems that it can be checked fast) and see what happened. > Luigi's performance numbers for our radix are too awful, and there is a > patch implementing alternative trie: > http://info.iet.unipi.it/~**luigi/papers/20120601-dxr.pdf > http://www.nxlab.fer.hr/dxr/**stable_8_20120824.diff > > So, I can make educated guesses about why this is better for forwarding workloads. I'd like to characterize it though. So, what's it doing that's better? better locking? better caching behaviour? less memory lookups? etc. 
Thanks,

-adrian

From owner-freebsd-arch@FreeBSD.ORG Mon Sep 23 11:37:46 2013
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 2200466A; Mon, 23 Sep 2013 11:37:46 +0000 (UTC) (envelope-from slw@zxy.spb.ru)
Received: from zxy.spb.ru (zxy.spb.ru [195.70.199.98]) by mx1.freebsd.org (Postfix) with ESMTP id D0D8624F2; Mon, 23 Sep 2013 11:37:45 +0000 (UTC)
Received: from slw by zxy.spb.ru with local (Exim 4.69 (FreeBSD)) (envelope-from ) id 1VO4UW-0001lI-DE; Mon, 23 Sep 2013 15:39:48 +0400
Date: Mon, 23 Sep 2013 15:39:48 +0400
From: Slawa Olhovchenkov
To: "Alexander V. Chernikov"
Subject: Re: Network stack changes
Message-ID: <20130923113948.GA5647@zxy.spb.ru>
References: <521E41CB.30700@yandex-team.ru> <521E78B0.6080709@freebsd.org> <523F4BED.6000804@yandex-team.ru>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <523F4BED.6000804@yandex-team.ru>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-SA-Exim-Connect-IP:
X-SA-Exim-Mail-From: slw@zxy.spb.ru
X-SA-Exim-Scanned: No (on zxy.spb.ru); SAEximRunCond expanded to false
Cc: adrian@freebsd.org, Andre Oppermann , freebsd-hackers@freebsd.org, freebsd-arch@freebsd.org, luigi@freebsd.org, ae@FreeBSD.org, Gleb Smirnoff , FreeBSD Net
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 23 Sep 2013 11:37:46 -0000

On Sun, Sep 22, 2013 at 11:58:37PM +0400, Alexander V. Chernikov wrote:
> I've found the paper I was talking about:
> http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
>
> It claims that our radix is able to do 6MPPS/core and that it does not
> scale with the number of cores.

Our radix is buggy and doesn't work correctly.
From owner-freebsd-arch@FreeBSD.ORG Mon Sep 23 22:46:47 2013 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 8EB6AE72; Mon, 23 Sep 2013 22:46:47 +0000 (UTC) (envelope-from sodynet1@gmail.com) Received: from mail-pd0-x22a.google.com (mail-pd0-x22a.google.com [IPv6:2607:f8b0:400e:c02::22a]) (using TLSv1 with cipher ECDHE-RSA-RC4-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 4269C2F8F; Mon, 23 Sep 2013 22:46:47 +0000 (UTC) Received: by mail-pd0-f170.google.com with SMTP id x10so3809582pdj.29 for ; Mon, 23 Sep 2013 15:46:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=UjilBwU9OcaYlukiK4g1agmh0oG9lXC8IZz6YH7pBlg=; b=zNd2nK4fBP8S1qNkzhLLiAaFq3C4x9cs2l/LTZGD8uliIScxephPgtW+7SCmsbDgyP zPOwbSas1L4JqJYHFeXl+xDyHGgcKGQxW90RfR95vFmZzoqsZPL6IkW4nE98w++cfJsZ +e6CJwB7dPIYAg0V1byVyhshHzsVmidAdwxDXn3Aij1tX5RjR+b7zexAkbHtjh1aDB/L FafyXOjh/rDms1JVnjmCUv9t2V6LedW+gx722WWwMGk7TiOi77IdH6DE5B9lYEfwZ4g7 trkHotGxZ1WWlEq6SF2TtwsgUIzr8ghc2hO73/neyiPvynvGoPaLlM/NHjuRcQjZss2Q lv5A== MIME-Version: 1.0 X-Received: by 10.68.11.103 with SMTP id p7mr3431565pbb.84.1379976406783; Mon, 23 Sep 2013 15:46:46 -0700 (PDT) Received: by 10.70.30.98 with HTTP; Mon, 23 Sep 2013 15:46:46 -0700 (PDT) In-Reply-To: <523F4F14.9090404@yandex-team.ru> References: <521E41CB.30700@yandex-team.ru> <523F4F14.9090404@yandex-team.ru> Date: Tue, 24 Sep 2013 01:46:46 +0300 Message-ID: Subject: Re: Network stack changes From: Sami Halabi To: "Alexander V. Chernikov" Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: Adrian Chadd , Andre Oppermann , "freebsd-hackers@freebsd.org" , "freebsd-arch@freebsd.org" , Luigi Rizzo , "Andrey V. Elsukov" , FreeBSD Net X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 23 Sep 2013 22:46:47 -0000 Hi, > http://info.iet.unipi.it/~**luigi/papers/20120601-dxr.pdf > http://www.nxlab.fer.hr/dxr/**stable_8_20120824.diff I've tried the diff in 10-current, applied cleanly but had errors compiling new kernel... is there any work to make it work? i'd love to test it. Sami On Sun, Sep 22, 2013 at 11:12 PM, Alexander V. Chernikov < melifaro@yandex-team.ru> wrote: > On 29.08.2013 15:49, Adrian Chadd wrote: > >> Hi, >> > Hello Adrian! > I'm very sorry for the looong reply. > > > >> There's a lot of good stuff to review here, thanks! >> >> Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to >> keep locking things like that on a per-packet basis. We should be able to >> do this in a cleaner way - we can defer RX into a CPU pinned taskqueue and >> convert the interrupt handler to a fast handler that just schedules that >> taskqueue. We can ignore the ithread entirely here. >> >> What do you think? >> > Well, it sounds good :) But performance numbers and Jack opinion is more > important :) > > Are you going to Malta? 
>
>> Totally pie in the sky handwaving at this point:
>>
>> * create an array of mbuf pointers for completed mbufs;
>> * populate the mbuf array;
>> * pass the array up to ether_demux().
>>
>> For vlan handling, it may end up populating its own list of mbufs to push
>> up to ether_demux(). So maybe we should extend the API to have a bitmap of
>> packets to actually handle from the array, so we can pass up a larger array
>> of mbufs, note which ones are for the destination and then the upcall can
>> mark which frames its consumed.
>>
>> I specifically wonder how much work/benefit we may see by doing:
>>
>> * batching packets into lists so various steps can batch process things
>> rather than run to completion;
>> * batching the processing of a list of frames under a single lock
>> instance - eg, if the forwarding code could do the forwarding lookup for
>> 'n' packets under a single lock, then pass that list of frames up to
>> inet_pfil_hook() to do the work under one lock, etc, etc.
>
> I'm thinking the same way, but we're stuck with 'forwarding lookup' due to
> problem with egress interface pointer, as I mention earlier. However it is
> interesting to see how much it helps, regardless of locking.
>
> Currently I'm thinking that we should try to change radix to something
> different (it seems that it can be checked fast) and see what happened.
> Luigi's performance numbers for our radix are too awful, and there is a
> patch implementing alternative trie:
> http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
> http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff
>
>> Here, the processing would look less like "grab lock and process to
>> completion" and more like "mark and sweep" - ie, we have a list of frames
>> that we mark as needing processing and mark as having been processed at
>> each layer, so we know where to next dispatch them.
>>
>> I still have some tool coding to do with PMC before I even think about
>> tinkering with this as I'd like to measure stuff like per-packet latency as
>> well as top-level processing overhead (ie, CPU_CLK_UNHALTED.THREAD_P /
>> lagg0 TX bytes/pkts, RX bytes/pkts, NIC interrupts on that core, etc.)
>
> That will be great to see!
>
>> Thanks,
>>
>> -adrian
>
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"

--
Sami Halabi
Information Systems Engineer
NMS Projects Expert
FreeBSD SysAdmin Expert

From owner-freebsd-arch@FreeBSD.ORG Tue Sep 24 07:58:31 2013
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 2E99BDED; Tue, 24 Sep 2013 07:58:31 +0000 (UTC) (envelope-from zec@fer.hr)
Received: from mail.fer.hr (mail.fer.hr [161.53.72.233]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 5411428EC; Tue, 24 Sep 2013 07:58:29 +0000 (UTC)
Received: from dyn10.nxlab.fer.hr (161.53.63.210) by MAIL.fer.hr (161.53.72.233) with Microsoft SMTP Server (TLS) id 14.2.342.3; Tue, 24 Sep 2013 09:57:14 +0200
From: Marko Zec
To:
Subject: Re: Network stack changes
Date: Tue, 24 Sep 2013 09:58:05 +0200
User-Agent: KMail/1.9.10
References: <521E41CB.30700@yandex-team.ru> <523F4F14.9090404@yandex-team.ru>
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-ID: <201309240958.06172.zec@fer.hr>
X-Originating-IP: [161.53.63.210]
Cc: "Alexander V. Chernikov" , Adrian Chadd , Andre Oppermann , FreeBSD Net , Luigi Rizzo , "Andrey V. Elsukov" , "freebsd-arch@freebsd.org" , Sami Halabi
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 24 Sep 2013 07:58:31 -0000

On Tuesday 24 September 2013 00:46:46 Sami Halabi wrote:
> Hi,
>
> > http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
> > http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff
>
> I've tried the diff in 10-current, applied cleanly but had errors
> compiling new kernel... is there any work to make it work? i'd love to
> test it.

Even if you'd make it compile on current, you could only run synthetic
tests measuring lookup performance using streams of random keys, as
outlined in the paper (btw. the paper at Luigi's site is an older draft;
the final version with slightly revised benchmarks is available here:
http://www.sigcomm.org/sites/default/files/ccr/papers/2012/October/2378956-2378961.pdf)

I.e. the code only hooks into the routing API for testing purposes, but is
completely disconnected from the forwarding path.

We have a prototype in the works which combines DXR with Netmap in
userspace and is capable of sustaining well above line rate forwarding
with full-sized BGP views using Intel 10G cards on commodity multicore
machines. The work was somewhat stalled during the summer, but I plan to
wrap it up and release the code by the end of this year. With recent
advances in netmap it might also be feasible to merge DXR and netmap
entirely inside the kernel, but I've not explored that path yet...

Marko

> Sami
>
> On Sun, Sep 22, 2013 at 11:12 PM, Alexander V. Chernikov <
> melifaro@yandex-team.ru> wrote:
Marko

> Sami
>
> On Sun, Sep 22, 2013 at 11:12 PM, Alexander V. Chernikov
> <melifaro@yandex-team.ru> wrote:
> > On 29.08.2013 15:49, Adrian Chadd wrote:
> >> Hi,
> >
> > Hello Adrian!
> > I'm very sorry for the looong delay in replying.
> >
> >> There's a lot of good stuff to review here, thanks!
> >>
> >> Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless
> >> to keep locking things like that on a per-packet basis. We should
> >> be able to do this in a cleaner way - we can defer RX into a
> >> CPU-pinned taskqueue and convert the interrupt handler to a fast
> >> handler that just schedules that taskqueue. We can ignore the
> >> ithread entirely here.
> >>
> >> What do you think?
> >
> > Well, it sounds good :) But performance numbers and Jack's opinion
> > are more important :)
> >
> > Are you going to Malta?
> >
> >> Totally pie-in-the-sky handwaving at this point:
> >>
> >> * create an array of mbuf pointers for completed mbufs;
> >> * populate the mbuf array;
> >> * pass the array up to ether_demux().
> >>
> >> For vlan handling, it may end up populating its own list of mbufs
> >> to push up to ether_demux(). So maybe we should extend the API to
> >> have a bitmap of packets to actually handle from the array, so we
> >> can pass up a larger array of mbufs, note which ones are for the
> >> destination, and then the upcall can mark which frames it has
> >> consumed.
> >>
> >> I specifically wonder how much work/benefit we may see by doing:
> >>
> >> * batching packets into lists so various steps can batch-process
> >> things rather than run to completion;
> >> * batching the processing of a list of frames under a single lock
> >> instance - eg, if the forwarding code could do the forwarding
> >> lookup for 'n' packets under a single lock, then pass that list of
> >> frames up to inet_pfil_hook() to do the work under one lock, etc,
> >> etc.
> >
> > I'm thinking the same way, but we're stuck with the 'forwarding
> > lookup' due to the problem with the egress interface pointer, as I
> > mentioned earlier. Still, it would be interesting to see how much
> > batching helps, regardless of locking.
> >
> > Currently I'm thinking that we should try to change radix to
> > something different (it seems this can be tested quickly) and see
> > what happens. Luigi's performance numbers for our radix are awful,
> > and there is a patch implementing an alternative trie:
> > http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
> > http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff
> >
> >> Here, the processing would look less like "grab lock and process to
> >> completion" and more like "mark and sweep" - ie, we have a list of
> >> frames that we mark as needing processing and mark as having been
> >> processed at each layer, so we know where to next dispatch them.
> >>
> >> I still have some tool coding to do with PMC before I even think
> >> about tinkering with this, as I'd like to measure stuff like
> >> per-packet latency as well as top-level processing overhead (ie,
> >> CPU_CLK_UNHALTED.THREAD_P / lagg0 TX bytes/pkts, RX bytes/pkts, NIC
> >> interrupts on that core, etc.)
> >
> > That will be great to see!
> >> Thanks,
> >>
> >> -adrian
> >
> > _______________________________________________
> > freebsd-net@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-net
> > To unsubscribe, send any mail to
> > "freebsd-net-unsubscribe@freebsd.org"
>
> --
> Sami Halabi
> Information Systems Engineer
> NMS Projects Expert
> FreeBSD SysAdmin Expert
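Coming back to the RX-path deferral Adrian suggests in the quoted
thread (a fast interrupt handler that only kicks a CPU-pinned
taskqueue), a minimal sketch of the driver side. The taskqueue(9) and
filter-interrupt KPIs are real; the foo_* softc layout and foo_rxeof()
are hypothetical and untested, and pinning the taskqueue thread to the
queue's CPU is left out:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bus.h>
#include <sys/malloc.h>
#include <sys/priority.h>
#include <sys/taskqueue.h>

struct foo_softc {
        device_t          dev;
        struct resource  *irq;
        void             *intr_cookie;
        struct taskqueue *tq;
        struct task       rx_task;
};

static void foo_rxeof(struct foo_softc *); /* hypothetical ring drain */

static int
foo_intr_filter(void *arg)
{
        struct foo_softc *sc = arg;

        /*
         * Fast-interrupt context: no locks, no mbuf work - just
         * schedule the taskqueue and leave; the ithread is bypassed.
         */
        taskqueue_enqueue(sc->tq, &sc->rx_task);
        return (FILTER_HANDLED);
}

static void
foo_rx_task(void *arg, int pending)
{

        foo_rxeof(arg);         /* drain the ring, batch up mbufs */
}

static int
foo_setup_rx_deferral(struct foo_softc *sc)
{

        TASK_INIT(&sc->rx_task, 0, foo_rx_task, sc);
        sc->tq = taskqueue_create_fast("foo_rxq", M_NOWAIT,
            taskqueue_thread_enqueue, &sc->tq);
        taskqueue_start_threads(&sc->tq, 1, PI_NET, "%s rxq",
            device_get_nameunit(sc->dev));
        return (bus_setup_intr(sc->dev, sc->irq,
            INTR_TYPE_NET | INTR_MPSAFE, foo_intr_filter, NULL, sc,
            &sc->intr_cookie));
}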