From: Bakul Shah
To: Luigi Rizzo
Cc: Ruslan Ermilov, net@FreeBSD.ORG
Subject: Re: Consistency of cached routes
In-reply-to: Your message of "Tue, 13 Aug 2002 02:13:33 PDT." <20020813021333.A4507@iguana.icir.org>
Date: Tue, 13 Aug 2002 10:22:19 -0700
Message-ID: <200208131722.NAA21497@wellington.cnchost.com>

> ip_forward, ipflow and TCP/UDP sockets cache a copy
> of the result of route lookups, but these entries might be
> out-of-date when a routing update is performed. Ruslan just MFC'ed
> a fix to invalidate the (one-entry) cache in ip_forward, but
> the other two can still be inconsistent.

If a host route to ADDR is cloned from a route to ADDR/n and a
route to ADDR/n+k is added, the cloned route to ADDR must be
deleted. This is already taken care of in rt_fixchange() -- see
the last comment in this function. The next time a pkt to ADDR
is delivered, a new ADDR route will be cloned from ADDR/n+k.

TCP/UDP sockets cache a host route. If it is a cloned route it
will get the above treatment. So the issue is really only for
cached net routes that are less specific than a newly added
route. Does ipflow cache those?

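The cloned-route rule above can be illustrated with a small model. This is only a sketch in Python, not the kernel's rt_fixchange(); the function name `stale_clones` and the dict that maps each cloned host address to its parent prefix are assumptions made for the example.

```python
import ipaddress

def stale_clones(clones, new_route):
    """Return the cloned host routes that must go when new_route is added.

    clones    -- hypothetical dict: host address -> parent prefix it was cloned from
    new_route -- the prefix being added (ADDR/n+k in the text)
    """
    new_net = ipaddress.ip_network(new_route)
    stale = []
    for host, parent in clones.items():
        parent_net = ipaddress.ip_network(parent)
        # The new route must actually cover the cloned host address ...
        covers_host = ipaddress.ip_address(host) in new_net
        # ... and be more specific than the route the clone came from.
        more_specific = new_net.prefixlen > parent_net.prefixlen
        if covers_host and more_specific:
            stale.append(host)
    return stale

# Example: two host routes cloned from 10.0.0.0/8.
clones = {"10.1.2.3": "10.0.0.0/8", "10.9.9.9": "10.0.0.0/8"}
affected = stale_clones(clones, "10.1.0.0/16")
```

Only the clone covered by the new, more specific prefix is affected; the next packet to that address would then clone a fresh host route from the new prefix.
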
> ok, so i guess it is time to instrument route lookups and see
> how expensive they are, and sort out what is the best way to
> solve the tradeoff between caching and potential inconsistencies.

Any inconsistency needs to be fixed.

> The timestamp idea is good because it has constant cost, though
> on a box with many routing updates it might completely defeat
> the cache.

You can use a long long counter as a virtual timestamp. Even if
you allow a million route changes per second, this number won't
roll over for almost 300000 years! But this will cause all cached
entries to be flushed any time the route table is updated, and
every cache use gets a little more expensive. On the other hand,
now you can afford to keep a bigger cache. Caching just one
forwarding entry doesn't make a lot of sense for a router.

> Ideas anyone ?

This might be a problem with a known efficient solution. I just
thought of a simple way. When a route to ADDR/n is added, for
all routes to ADDR/n-k (k in 0,1..n) with refcnt > 0, *move* the
route entry! That is, allocate a new data structure, do a memcpy
from old to new and move its sole route table reference from the
old entry to the new entry. Mark the original rtentry data
structure as down. It still exists but is not accessible from
the route table; it is accessible only from its cached
references. Upon next use of such a ref it will be found to be
down, so its refcount gets decremented (and the structure deleted
when refcnt drops to 0) and a new rtalloc will be done.

This also entirely avoids the need to walk the whole tree on
addition/deletion/change of any route, and we can dispense with
the silliness of cloned routes altogether and go back to keeping
a separate host route cache.

-- bakul
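
The move-on-add scheme described above can be modeled roughly as follows. This is a toy Python sketch under stated assumptions, not kernel code: `RtEntry`, `RouteTable` and `use_cached` are hypothetical names, lookup is a linear scan rather than a radix tree, and `refcnt` here counts only cached references held outside the table.

```python
import ipaddress

class RtEntry:
    def __init__(self, prefix, gateway):
        self.net = ipaddress.ip_network(prefix)
        self.gateway = gateway
        self.refcnt = 0   # cached references held outside the table
        self.up = True    # cleared once the entry is moved out of the table

class RouteTable:
    def __init__(self):
        self.routes = {}  # ip_network -> RtEntry

    def rtalloc(self, addr):
        """Longest-prefix match; takes a reference on the result."""
        ip = ipaddress.ip_address(addr)
        matches = [e for n, e in self.routes.items() if ip in n]
        best = max(matches, key=lambda e: e.net.prefixlen, default=None)
        if best is not None:
            best.refcnt += 1
        return best

    def add(self, prefix, gateway):
        new = RtEntry(prefix, gateway)
        for net, old in list(self.routes.items()):
            # Equal or less specific routes that cover the new prefix
            # and still have cached references get "moved".
            covers = net.version == new.net.version and new.net.subnet_of(net)
            if covers and old.refcnt > 0:
                self.routes[net] = RtEntry(str(net), old.gateway)  # the "memcpy"
                old.up = False  # now reachable only through cached refs
        self.routes[new.net] = new

def use_cached(table, cached, addr):
    """Check a cached reference before use; look it up again if it is down."""
    if cached is not None and cached.up:
        return cached
    if cached is not None:
        cached.refcnt -= 1  # the last holder dropping its ref frees the entry
    return table.rtalloc(addr)

# Example: a cached reference goes stale when a more specific route appears.
table = RouteTable()
table.add("10.0.0.0/8", "gw1")
cached = table.rtalloc("10.1.2.3")
table.add("10.1.0.0/16", "gw2")
fresh = use_cached(table, cached, "10.1.2.3")
```

Adding a route touches only the entries that cover the new prefix, and each holder of a stale reference pays for one extra lookup on its next use. Note that in this model every cached reference to a moved entry is invalidated, including those for addresses the new route does not cover.
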