From: "K. Macy"
To: Luigi Rizzo
Cc: "Li, Qing", "current@freebsd.org", "net@freebsd.org"
Date: Tue, 24 Apr 2012 18:18:29 +0200
Subject: Re: Some performance measurements on the FreeBSD network stack
In-Reply-To: <20120424163423.GA59530@onelab2.iet.unipi.it>
References: <20120419133018.GA91364@onelab2.iet.unipi.it> <4F907011.9080602@freebsd.org> <20120419204622.GA94904@onelab2.iet.unipi.it> <20120424163423.GA59530@onelab2.iet.unipi.it>

On Tue, Apr 24, 2012 at 6:34 PM, Luigi Rizzo wrote:
> On Tue, Apr 24, 2012 at 02:16:18PM +0000, Li, Qing wrote:
>> >
>> > From previous tests, the difference between flowtable and
>> > routing table was small with a single process (about 5% or 50ns
>> > in the total packet processing time, if i remember well),
>> > but there was a large gain with multiple concurrent processes.
>> >
>>
>> Yes, that sounds about right when we did the tests a long while ago.
>>
>> >
>> > Removing flowtable increases the cost in ip_output()
>> > (obviously) but also in ether_output() (because the
>> > route does not have a lle entry so you need to call
>> > arpresolve on each packet).
>> >
>>
>> Yup.
>>
>> >
>> > So in revising the route lookup i believe it would be good
>> > if we could also get at once most of the info that
>> > ether_output() is computing again and again.
>> >
>>
>> Well, the routing table no longer maintains any lle info, so there
>> isn't much to copy out the rtentry at the completion of route
>> lookup.
>>
>> If I understood you correctly, you do believe there is a lot of value
>> in Flowtable caching concept, but you are not suggesting we reverting
>> back to having the routing table maintain L2 entries, are you ?
>
> I see a lot of value in caching in general.
>
> Especially for a bound socket it seems pointless to lookup the
> route, iface and mac address(es) on every single packet instead of
> caching them.
> And, routes and MAC addresses are volatile anyways
> so making sure that we do the lookup 1us closer to the actual use
> gives no additional guarantee.
>
> The frequency with which these info (routes and MAC addresses)
> change clearly influences the mechanism to validate the cache.
> I suppose we have the following options:
>
> - direct notification: a failure in a direct chain of calls
>   can be used to invalidate the info cached in the socket.
>   Similarly, some incoming traffic (e.g. TCP RST, FIN,
>   ICMP messages) that reach a socket can invalidate the cached values
> - assume a minimum lifetime for the info (i think this is what
>   happens in the flowtable) and flush it unconditionally
>   every such interval (say 10ms).
> - if some info changes infrequently (e.g. MAC addresses) one could
>   put a version number in the cached value and use it to validate
>   the cache.

I have a patch that has been sitting around for a long time, due to
review cycle latency, that caches a pointer to the rtentry (and
llentry) in the inpcb. Before each use, the rtentry is checked
against a generation number in the routing tree that is incremented
on every routing table update.
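
For readers following the thread, the generation check described above can be sketched roughly as follows. This is a userland illustration only, not the patch itself: struct route_entry, struct route_table, struct pcb_cache and the functions table_update(), table_lookup() and cached_route() are all hypothetical names invented for the sketch, and the "table" holds a single route to keep it short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are NOT the FreeBSD kernel structures. */
struct route_entry {
    int ifindex;                    /* outgoing interface for this route */
};

struct route_table {
    uint32_t gen;                   /* incremented on every table update */
    struct route_entry entry;       /* one route only, to keep the sketch small */
    unsigned lookups;               /* counts slow-path lookups, for illustration */
};

struct pcb_cache {
    const struct route_entry *rt;   /* cached pointer, NULL when empty */
    uint32_t gen;                   /* table generation seen when rt was cached */
};

/* Any change to the table bumps the generation number. */
void table_update(struct route_table *t, int ifindex)
{
    t->entry.ifindex = ifindex;
    t->gen++;
}

/* Slow path: stands in for a full routing table lookup. */
const struct route_entry *table_lookup(struct route_table *t)
{
    t->lookups++;
    return &t->entry;
}

/* Fast path: reuse the cached pointer while the generation still matches. */
const struct route_entry *cached_route(struct pcb_cache *c, struct route_table *t)
{
    if (c->rt == NULL || c->gen != t->gen) {
        c->rt = table_lookup(t);    /* cache is empty or stale: refill it */
        c->gen = t->gen;
    }
    return c->rt;
}
```

With this shape, repeated sends on the same socket cost one integer comparison until the table changes. A real implementation would also need locking or atomic reads of the generation number, and reference counting on the cached entry; the sketch leaves both out.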