From: "Alexander V. Chernikov"
Date: Mon, 23 Sep 2013 00:12:04 +0400
To: Adrian Chadd
Cc: Luigi Rizzo, Andre Oppermann, "freebsd-hackers@freebsd.org", FreeBSD Net, "Andrey V. Elsukov", Gleb Smirnoff, "freebsd-arch@freebsd.org"
Subject: Re: Network stack changes
Message-ID: <523F4F14.9090404@yandex-team.ru>
References: <521E41CB.30700@yandex-team.ru>

On 29.08.2013 15:49, Adrian Chadd wrote:
> Hi,
Hello Adrian!
I'm very sorry for the long delay in replying.
>
> There's a lot of good stuff to review here, thanks!
>
> Yes, the ixgbe RX lock needs to die in a fire. It's kinda pointless to
> keep locking things like that on a per-packet basis. We should be able
> to do this in a cleaner way - we can defer RX into a CPU pinned
> taskqueue and convert the interrupt handler to a fast handler that
> just schedules that taskqueue. We can ignore the ithread entirely here.
>
> What do you think?
Well, it sounds good :) But performance numbers and Jack's opinion are
more important :)

Are you going to Malta?
>
> Totally pie in the sky handwaving at this point:
>
> * create an array of mbuf pointers for completed mbufs;
> * populate the mbuf array;
> * pass the array up to ether_demux().
>
> For vlan handling, it may end up populating its own list of mbufs to
> push up to ether_demux(). So maybe we should extend the API to have a
> bitmap of packets to actually handle from the array, so we can pass up
> a larger array of mbufs, note which ones are for the destination and
> then the upcall can mark which frames it's consumed.
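The array-plus-bitmap idea quoted above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct pkt` is a stand-in for `struct mbuf`, and `pkt_batch`, `batch_add`, `vlan_consume` and `ether_consume` are hypothetical names, not existing kernel interfaces. Each layer walks the same array and clears the bit of every frame it consumes, so the next layer sees only what is still pending.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX 64            /* one bit per slot in a 64-bit bitmap */

enum { LAYER_NONE = 0, LAYER_VLAN = 1, LAYER_ETHER = 2 };

/* Hypothetical stand-in for struct mbuf; only what the sketch needs. */
struct pkt {
    uint16_t vlan_tag;          /* 0 means untagged */
    int      handled_by;        /* layer that consumed the frame */
};

struct pkt_batch {
    struct pkt *slot[BATCH_MAX];
    size_t      count;
    uint64_t    pending;        /* bit i set: slot[i] still needs handling */
};

void
batch_init(struct pkt_batch *b)
{
    b->count = 0;
    b->pending = 0;
}

/* Append a packet and mark it pending. Returns 0, or -1 if the batch is full. */
int
batch_add(struct pkt_batch *b, struct pkt *p)
{
    assert(p != NULL);
    if (b->count >= BATCH_MAX)
        return -1;
    b->slot[b->count] = p;
    b->pending |= (uint64_t)1 << b->count;
    b->count++;
    return 0;
}

/* vlan layer: consume only tagged frames and clear their bits. */
size_t
vlan_consume(struct pkt_batch *b)
{
    size_t n = 0;

    for (size_t i = 0; i < b->count; i++) {
        uint64_t bit = (uint64_t)1 << i;

        if ((b->pending & bit) != 0 && b->slot[i]->vlan_tag != 0) {
            b->slot[i]->handled_by = LAYER_VLAN;
            b->pending &= ~bit;
            n++;
        }
    }
    return n;
}

/* ether layer: consume every frame that is still pending. */
size_t
ether_consume(struct pkt_batch *b)
{
    size_t n = 0;

    for (size_t i = 0; i < b->count; i++) {
        uint64_t bit = (uint64_t)1 << i;

        if ((b->pending & bit) != 0) {
            b->slot[i]->handled_by = LAYER_ETHER;
            b->pending &= ~bit;
            n++;
        }
    }
    return n;
}
```

A driver would fill the batch once per RX pass, call `vlan_consume()` and then `ether_consume()`; a zero `pending` bitmap afterwards means every frame found an owner. A 64-bit bitmap caps the batch at 64 frames, which is a choice made for this sketch only.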
>
> I specifically wonder how much work/benefit we may see by doing:
>
> * batching packets into lists so various steps can batch process
> things rather than run to completion;
> * batching the processing of a list of frames under a single lock
> instance - eg, if the forwarding code could do the forwarding lookup
> for 'n' packets under a single lock, then pass that list of frames up
> to inet_pfil_hook() to do the work under one lock, etc, etc.
I'm thinking the same way, but we're stuck with the 'forwarding lookup'
because of the problem with the egress interface pointer, as I mentioned
earlier. However, it would be interesting to see how much batching
helps, regardless of locking.

Currently I'm thinking that we should try to change radix to something
different (it seems this can be checked quickly) and see what happens.
Luigi's performance numbers for our radix are awful, and there is a
patch implementing an alternative trie:
http://info.iet.unipi.it/~luigi/papers/20120601-dxr.pdf
http://www.nxlab.fer.hr/dxr/stable_8_20120824.diff
>
> Here, the processing would look less like "grab lock and process to
> completion" and more like "mark and sweep" - ie, we have a list of
> frames that we mark as needing processing and mark as having been
> processed at each layer, so we know where to next dispatch them.
>
> I still have some tool coding to do with PMC before I even think about
> tinkering with this as I'd like to measure stuff like per-packet
> latency as well as top-level processing overhead (ie,
> CPU_CLK_UNHALTED.THREAD_P / lagg0 TX bytes/pkts, RX bytes/pkts, NIC
> interrupts on that core, etc.)
That will be great to see!
>
> Thanks,
>
>
>
> -adrian
>
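The "forwarding lookup for 'n' packets under a single lock" idea discussed above can be illustrated with a toy userspace sketch. Everything here is an assumption made for illustration: `struct route_table`, `rt_lookup_each()` and `rt_lookup_batch()` are hypothetical names, the table is a linear longest-prefix scan rather than radix or DXR, the lock is a C11 `atomic_flag` spinlock standing in for the real table lock, and the result is an interface index, so the egress interface pointer problem is not addressed at all.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy route: prefix/mask mapped to an egress interface index. */
struct route {
    uint32_t prefix;
    uint32_t mask;              /* contiguous netmask, host byte order */
    int      ifidx;
};

struct route_table {
    atomic_flag         lock;          /* stand-in for the real table lock */
    unsigned            acquisitions;  /* instrumentation: lock round trips */
    const struct route *routes;
    size_t              nroutes;
};

void
rt_lock(struct route_table *rt)
{
    while (atomic_flag_test_and_set_explicit(&rt->lock, memory_order_acquire))
        ;                       /* spin until the flag was clear */
    rt->acquisitions++;
}

void
rt_unlock(struct route_table *rt)
{
    atomic_flag_clear_explicit(&rt->lock, memory_order_release);
}

/* Longest-prefix match by linear scan; the caller must hold the lock.
 * Returns the interface index, or -1 if no route matches. */
int
rt_lookup_locked(const struct route_table *rt, uint32_t dst)
{
    int best = -1;
    uint32_t best_mask = 0;

    for (size_t i = 0; i < rt->nroutes; i++) {
        const struct route *r = &rt->routes[i];

        if ((dst & r->mask) == r->prefix &&
            (best == -1 || r->mask > best_mask)) {
            best = r->ifidx;
            best_mask = r->mask;
        }
    }
    return best;
}

/* Run-to-completion style: one lock round trip per packet. */
void
rt_lookup_each(struct route_table *rt, const uint32_t *dst, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        rt_lock(rt);
        out[i] = rt_lookup_locked(rt, dst[i]);
        rt_unlock(rt);
    }
}

/* Batched style: one lock round trip for all n packets. */
void
rt_lookup_batch(struct route_table *rt, const uint32_t *dst, int *out, size_t n)
{
    rt_lock(rt);
    for (size_t i = 0; i < n; i++)
        out[i] = rt_lookup_locked(rt, dst[i]);
    rt_unlock(rt);
}
```

Both functions return the same results; the `acquisitions` counter only makes the difference in lock traffic visible. Whether that difference matters in practice is exactly the kind of thing the PMC measurements mentioned above would have to show.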