From: "Bruce M. Simpson"
Date: Tue, 19 Jun 2007 13:34:06 +0100
To: Julian Elischer
Cc: qingli@FreeBSD.org, Luigi Rizzo, Gleb Smirnoff, net@FreeBSD.org, Qing Li
Subject: Re: new ARP code review
Message-ID: <4677CD3E.8080903@incunabulum.net>
References: <20070618142009.A40302@xorpc.icir.org> <467785F2.5090806@elischer.org>
In-Reply-To: <467785F2.5090806@elischer.org>

Julian Elischer wrote:
> I have some thoughts on this.
> firstly, while it is interesting to have an arp table (ok LLA table)
> on each interface, I'm not sure that it gains you very much.

Unfortunately, maintaining a single ARP table is insufficient for
supporting multiple paths within the IPv4 stack. Even without supporting
multiple routing paths, we would still need to break out the ARP cache
in this way in order to properly support being attached to the same
layer 2 domain (i.e. two network cards on the same Ethernet segment or
switch). At the moment if_bridge and netgraph are our
get-out-of-jail-free cards: they cause the IPv4 stack to be bypassed.

> As mentioned elsewhere, the connection of the arp information with the
> routing table means that the arp lookup is virtually free.
> Or, at least it used to be in the Uniprocessor world. It's hard to
> beat free.

It's hard to beat hard figures, which is something we don't have at the
moment. What we do have is a set of design considerations. Intuition
would suggest that one lock performs better than two; however, it
depends on the nature of the lock and on the nature of the data
structure lookup.

> The comment "Eventually, with this structure you can do the route lookup
> only when you need to find the next hop (e.g. when a route
> changes etc.) and just the much-cheaper L3-L2 map in other cases."
> makes me wonder. If we are not caching the arp code in the route any
> more, then how do we avoid doing a route lookup on each packet?

I don't think you can ever avoid doing a lookup of some kind per packet
if you're running a router. What you can do is amortize lookup cost over
time, e.g. two expensive initial lookups followed by one cheaper lookup
for subsequent packets. Whatever happens, though, has to play nice with
policy forwarding and source selection.
This is what complicates matters. Otherwise I'd just suggest keeping a
per-interface hash of ARP entries, an IPv4 routing trie, and a
per-destination cache hash which returns the combined lookup against the
trie and the L2 hash -- pretty much what Luigi is suggesting.

> BTW having a per interface arp table does make sense if there is a
> particular thread that is responsible for that interface, as only it
> would need access to the table and it could be done lock-free if one
> was careful enough.

The ARP code has to change, that much is certain, but the locking
strategy has yet to be decided. ARP entries are read far more often than
they are written, so it seems reasonable to use a different lock for
them.

BMS
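
For illustration only, here is a minimal userspace sketch of the layout
described above: a per-interface hash of ARP entries, a route lookup,
and a per-destination cache that returns the combined result. It is not
the proposed kernel code. Every name and size in it (arp_tab, dst_cache,
resolve, dst_cache_flush and so on) is hypothetical, a linear
longest-prefix scan stands in for the routing trie, the hashes are
direct-mapped, and locking, policy forwarding and source selection are
left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* All names and sizes below are hypothetical, chosen for illustration. */
#define MAX_IFS     4
#define ARP_BUCKETS 64
#define DST_BUCKETS 256
#define MAX_ROUTES  8

struct arp_entry { uint32_t ip; uint8_t mac[6]; int valid; };
struct route     { uint32_t prefix, mask, gateway; int ifindex; };
struct dst_entry { uint32_t dst; int ifindex; uint8_t mac[6]; int valid; };

static struct arp_entry arp_tab[MAX_IFS][ARP_BUCKETS]; /* one hash per interface */
static struct route     routes[MAX_ROUTES];            /* stand-in for the trie */
static int              nroutes;
static struct dst_entry dst_cache[DST_BUCKETS];        /* combined-result cache */
static unsigned         slow_lookups;                  /* counts cache misses */

/* Direct-mapped bucket index; a colliding key simply overwrites the slot. */
static unsigned bucket(uint32_t key, unsigned n)
{
    return (key * 2654435761u) % n;
}

static void arp_insert(int ifindex, uint32_t ip, const uint8_t mac[6])
{
    assert(ifindex >= 0 && ifindex < MAX_IFS);
    struct arp_entry *e = &arp_tab[ifindex][bucket(ip, ARP_BUCKETS)];
    e->ip = ip;
    memcpy(e->mac, mac, 6);
    e->valid = 1;
}

static const struct arp_entry *arp_lookup(int ifindex, uint32_t ip)
{
    const struct arp_entry *e = &arp_tab[ifindex][bucket(ip, ARP_BUCKETS)];
    return (e->valid && e->ip == ip) ? e : NULL;
}

static void route_add(uint32_t prefix, uint32_t mask, uint32_t gateway, int ifindex)
{
    assert(nroutes < MAX_ROUTES && ifindex >= 0 && ifindex < MAX_IFS);
    routes[nroutes++] = (struct route){ prefix, mask, gateway, ifindex };
}

/* Longest-prefix match by linear scan (contiguous host-order masks assumed). */
static const struct route *route_lookup(uint32_t dst)
{
    const struct route *best = NULL;
    for (int i = 0; i < nroutes; i++)
        if ((dst & routes[i].mask) == routes[i].prefix &&
            (best == NULL || routes[i].mask > best->mask))
            best = &routes[i];
    return best;
}

/* Must be called whenever a route or an ARP entry changes. */
static void dst_cache_flush(void)
{
    memset(dst_cache, 0, sizeof dst_cache);
}

/*
 * Miss: two expensive lookups (route, then ARP on the chosen interface),
 * and the combined result is cached. Hit: one cheap hash probe.
 * Returns 0 on success, -1 if there is no route or no ARP entry.
 */
static int resolve(uint32_t dst, uint8_t mac_out[6], int *ifindex_out)
{
    struct dst_entry *c = &dst_cache[bucket(dst, DST_BUCKETS)];
    if (!(c->valid && c->dst == dst)) {
        slow_lookups++;
        const struct route *r = route_lookup(dst);
        if (r == NULL)
            return -1;
        uint32_t nexthop = r->gateway ? r->gateway : dst;
        const struct arp_entry *a = arp_lookup(r->ifindex, nexthop);
        if (a == NULL)
            return -1; /* real code would start ARP resolution here */
        c->dst = dst;
        c->ifindex = r->ifindex;
        memcpy(c->mac, a->mac, 6);
        c->valid = 1;
    }
    memcpy(mac_out, c->mac, 6);
    *ifindex_out = c->ifindex;
    return 0;
}
```

The first resolve() for a destination pays for both lookups and later
calls hit only the destination cache, which is the amortization
discussed above. The cost moves into invalidation: in this sketch any
route or ARP change has to call dst_cache_flush(), and a real design
would need something finer-grained than flushing everything.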