From owner-freebsd-net@FreeBSD.ORG Thu Mar 7 11:44:16 2013
Message-ID: <51387D4A.9030408@FreeBSD.org>
Date: Thu, 07 Mar 2013 15:43:06 +0400
From: "Alexander V. Chernikov"
To: Andre Oppermann
Cc: net@freebsd.org
Subject: Re: [patch] interface routes
References: <513834E4.7050203@FreeBSD.org> <51384443.5070209@freebsd.org>
In-Reply-To: <51384443.5070209@freebsd.org>

On 07.03.2013 11:39, Andre Oppermann wrote:
> On 07.03.2013 07:34, Alexander V. Chernikov wrote:
>> Hello list!
>>
>> There is a known long-lived issue with interface routes
>> addition/deletion:
>>
>> ifconfig iface inet 1.2.3.4/24 can fail if given prefix is already in
>> kernel route table (for example, advertised by IGP like OSPF).
>>
>> Interface route can be deleted via route(8) or any route socket user
>> (sometimes this happens with popular opensource daemons like
>> bird/quagga).
>>
>> Problem is reported at least in kern/106722 and kern/155772.
>
> You patch is a welcome addition.
>
>> This can be fixed the following way:
>> Immutable route flag (RTM_PINNED, added in 19995 with 'for future use'
>> comment) is utilised to mark route 'immutable'.
>> rtrequest1_fib refuses to delete routes with given flag unless
>> RTM_PINNED is set in rti_flags.
>
> How do the routing daemons react to being unable to change/delete
> such a route?

Routing daemons have long lived with the fact that their route socket
commands can fail (and there is the route(8) utility, which can do
anything), so typically bird/quagga yells something like

  'bird: KRT: Error sending route 11.0.0.0/24 to kernel: File exists'

and marks the given route as not installed in its internal RIB.
Additionally, the daemon will probably retry inserting such routes on
every periodic KRT rescan (tens of minutes). Given that such situations
usually last only a very short time (e.g. physical link flaps),
everything should return to a normal state quickly.

> EADDRINUSE would likely be a more descriptive error instead of EPERM?

Well, I am not sure EADDRINUSE is very descriptive for _deleting_ a
route: "Yes, I know that it is in use, and that's the reason I'm trying
to delete it".

>> Every interface address manupulation is done via rtinit[1], so
>> rtinit1() sets this flag (and behavior does not change here).
>>
>> Adding interface address is handled via atomically deleting old prefix
>> and adding interface one.
>
> This brings up a long standing sore point of our routing code
> which this patch makes more pronounced. When an interface link
> state is down I don't want the route to it to persist but to
> become inactive so another path can be chosen. This the very
> point of running a routing daemon. So on the link-down event
> the installed interface routes should be removed from the routing
> table. The configured addresses though should persist and the
> interface routes re-installed on a link-up event.
> What's your opinion on it?

This is exactly what the current code does for IPv4: if_down() calls
if_unroute(), which calls pr_ctlinput() for every interface address, and
a domain-dependent function such as rip_ctlinput() calls in_ifscrub(),
which cleans up the given interface route. However, the address route
(/32) still remains (but routing daemons, at least bird, tend to ignore
it since it is not listed as a valid interface address/mask).

This is not done for IPv6, and we should probably do the same there.

>
> Other than these points I think your code is fine and can go
> into the tree.

-- 
WBR, Alexander
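
For readers who want to see the semantics discussed in this thread in one
place, here is a toy user-space model. It is only a sketch and is not the
patch itself. RouteTable, PINNED and the method names are hypothetical;
PINNED stands in for the flag the thread calls RTM_PINNED, and its value
is arbitrary. The error codes follow the thread (EPERM on a refused
delete, EEXIST for the 'File exists' message); ESRCH for a missing route
and the handling of a pinned add over an already pinned route are
assumptions of this model.

```python
import errno

PINNED = 0x1  # hypothetical stand-in for the "pinned" route flag


class RouteTable:
    """Toy model: maps a prefix string to the flags of its route."""

    def __init__(self):
        self.routes = {}

    def add(self, prefix, flags=0):
        old = self.routes.get(prefix)
        if old is not None:
            # An ordinary add (e.g. from a routing daemon) fails if the
            # prefix is already present: 'File exists'.
            if not (flags & PINNED):
                return errno.EEXIST
            # Assumption of this model: a pinned add never replaces
            # another pinned route.
            if old & PINNED:
                return errno.EEXIST
            # Interface route: drop the old prefix and install the
            # interface one in a single step.
        self.routes[prefix] = flags
        return 0

    def delete(self, prefix, flags=0):
        old = self.routes.get(prefix)
        if old is None:
            return errno.ESRCH
        # A pinned route can only be removed by a caller that passes
        # the pinned flag itself.
        if (old & PINNED) and not (flags & PINNED):
            return errno.EPERM
        del self.routes[prefix]
        return 0
```

In this model, a daemon that installed 11.0.0.0/24 first loses it when
the interface address is configured (add with PINNED), then gets EEXIST
when it tries to re-add the prefix and EPERM when it tries to delete it.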