From owner-freebsd-current@FreeBSD.ORG Wed Aug 10 17:58:49 2011 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CF416106564A for ; Wed, 10 Aug 2011 17:58:49 +0000 (UTC) (envelope-from onwahe@gmail.com) Received: from mail-yi0-f54.google.com (mail-yi0-f54.google.com [209.85.218.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8A7538FC1C for ; Wed, 10 Aug 2011 17:58:49 +0000 (UTC) Received: by yib19 with SMTP id 19so1001424yib.13 for ; Wed, 10 Aug 2011 10:58:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=BTKcn6qoa/yizH0mae2KztDJhimN1WV409zLHMMjzeA=; b=CI9Ydj/SxA38/PfeYp0kF/hVw6HdQEnQ4mH2wWc7SpcDwTIeiSm7ByoLsH/5mvokDC 65KIhxbbcKStmRaGtQZP0wADVD+B5MJyR+aJQ/zRAlVgqqJzp8235x2KO+0gTaJ/5wc2 k8njmrqWm71q0KVXaoy44tzoIFIPkR236ui3g= MIME-Version: 1.0 Received: by 10.142.229.17 with SMTP id b17mr3545976wfh.443.1312999128267; Wed, 10 Aug 2011 10:58:48 -0700 (PDT) Received: by 10.142.144.20 with HTTP; Wed, 10 Aug 2011 10:58:48 -0700 (PDT) In-Reply-To: References: <92B5D566-9816-4134-9358-2306D0F7DAFC@averesystems.com> <1312781293.2521.1.camel@srgsec> <1312960023.2614.12.camel@srgsec> Date: Wed, 10 Aug 2011 19:58:48 +0200 Message-ID: From: Svatopluk Kraus To: "Li, Qing" Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: Jeremiah Lott , Kevin Lo , Andrew Boyer , "freebsd-current@freebsd.org" Subject: Re: [patch] Problem with two NIC on same NET (in_scrubprefix: err=17, new prefix add failed) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Aug 2011 17:58:49 -0000 Hi, I try to desribe the whole matter. In the simplest case, I have two routers - each with two un-numbered interfaces connected in parallel. Real data flow is managed in hardware. IP layer is used for managing the hardwares, so I only need to be able to communicate with neighbour thru one from two links. It's not important which one is used, but if one link is down then the other must be used (if it is up). Now, it works on my table and the patches are made public. (To be perfectly honest, sometimes I need to send message thru given interface to know that communication thru this link is possible, but it's another - IP_SENDIF - story and I'm prepared to re-open a thread on it. Well, I've used Bruce M Simpson IP_SENDIF patch on current and it works really fine for me.) I've started my work with not point-to-point interfaces and I've found two problems. The first one - I've mentioned it in the beginning of this thread and the fix is committed already. The second one - submitted patch and description is bellow: http://www.freebsd.org/cgi/query-pr.cgi?pr=3D159603 If both interfaces are on same net and one (which installed network route) is going down, then the network route is replaced by network route of the other interface and everything is OK. I can access the network thru the other interface. However, if the second interface is going down too, then the network is replaced by network route of the first interface even if it is down. It is not OK. Futhermore, if the second interface is going up, then the correct (working) network route is not installed as a network route with same prefix is installed already but by interface which is down. So, I can't access the network and it is bad really. I have no problem with loopback routes when I work with not point-to-point interfaces as I can NOT set same source address on them. However, if the interface is going down and up, then loopback route is deleted without checking IFA_RTSELF flag (it must be consistent! especially in kernel) and re-added regardless of "useloopback" setting. So, at least, a loopback route is installed even if useloopback is NOT allowed! After that I've continued with point-to-point interfaces on same net, i.e. I've work with un-numbered interfaces. Firstly, I could not set parallel links on them. The fix is following and is already commmitted: http://www.freebsd.org/cgi/query-pr.cgi?pr=3D159600 The test case is simple. When the second (parallel) link was being set then network prefix (already installed by first interface) was not found (lookup by source address instead of destination one) and re-adding was failed obviously as the route really was installed. The bigger problem was with loopback routes on un-numbered interfaces. In in_ifinit(), when un-numbered interface is setting loopback route, then refcount on existing route is incremented and IFA_RTSELF flags is set on its address. This is done if and only if useloopback is set and interface is not IFF_LOOPBACK. It is OK. The rest is hacked somehow and I don't know why. In in_ifscrubprefix() which is used either when address is being deleted or interface is going down, I found first inconsistence. Refcount on existing route is decremented always (in both cases), but the route is deleted only when address is being deleted. Futhermore, IFA_RTSELF flag is NOT cleared when refcount is decremented! The fix and behavour is following: http://www.freebsd.org/cgi/query-pr.cgi?pr=3D159601 When two unnumbered interfaces are set and one of them is going down, then refcount on loopback route is decremented, but IFA_RTSELF flags remains set on interface address. It's inconsistent. After that, if address on second interface is being deleted, then loopback route is deleted too. No loopback route exists, but one interface has IFA_RTSELF flag set. In the view of this inconsistence, I understand a next one in rip_ctlinput(). When interface is going up, then loopback route is being deleted and re-added regardless of IFA_RTSELF flag and useloopback setting. If un-numbered interfaces are used, then it damages refcount on existing loopback route!! IMHO, as loopback routes are not touched when interface is going down, then the routes should be untouched if the interface is going up too. It's the simple patch, but don't work without previous one. However, useloopback complicates it. So, I'm proposing the following patch: If useloopback IS NOT set and IFA_RTSELF IS set, then loopback route is deleted, but with correct refcount game. And if useloopback is SET and IFA_RTSELF IS NOT set and interface IS NOT IFF_LOOPBACK, then loopback route is added, but again with correct refcount game too. It (with previous patch) should ensure IFA_RTSELF and loopback route consistence. Well, it's genesis of the patches and I think it is good case to make it working. On Wed, Aug 10, 2011 at 10:48 AM, Li, Qing wrote: > =A0Hi, > >> >> I've continued with work on two NIC on same NET. Now, with >> point-to-point interfaces too and I have more small fixes which I >> submitted today: >> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=3D159600 >> > > =A0 =A0 The fix is not entirely correct. The "rtinitflags()" could set RT= F_HOST flag when the interface > =A0 =A0 =A0type is IFF_LOOPBACK, not necessarily a PPP llink. In fact, I'm not really happy with rtinitflags() used in places, where RTF_HOST flag should be tested. However, even if it is not my code, I understand that the thing with IFF_LOOPBACK could be problem. > >> >> http://www.freebsd.org/cgi/query-pr.cgi?pr=3D159601 >> http://www.freebsd.org/cgi/query-pr.cgi?pr=3D159602 >> http://www.freebsd.org/cgi/query-pr.cgi?pr=3D159603 >> > > =A0 =A0 I need to run some tests on your patch, but keep in mind the LLE_= STATIC is sort overloaded > =A0 =A0 to take care of the case where static routes are maintained in th= e routing table while dynamic > =A0 =A0 routes are removed when the interface is taken down. Yes, I think that LLE_STATIC is a little bit misused too, but I try to make my patch consistent with current implementation. > >> >> I have one more related problem, but I'm not sure how complex the fix sh= ould be. >> >> When an interface is marked down a network route is deleted (or >> replaced) and a loopback route persists in routing table. It is OK. >> However, when an interface is marked up again, then a network route is >> installed unconditionally (but error is ignored) and a loopbak route >> is deleted and added immediately and unconditionally too. IMHO, it is >> not correct behaviour. I think that a second half of in_ifinit() >> should be here starting by in_addprefix() call with some small or >> bigger changes. >> > > =A0 =A0 =A0Unless you have a really good reason, other than code inspecti= on, and have > =A0 =A0 =A0a set of test cases, please leave this code alone for now. See= below ... I have good reason, but I can hack kernel just for me only in worse scenario. However, I always try to minimalize the hacks count. > >> >> Maybe, adding network route and ignoring error could be OK, but >> deleting loopback route should be done under IFA_RTSELF flag is set >> condition (with existing route refcount check). V_useloopback should >> be check before re-adding the route and existing route must be check >> to evaluate refcount correctly. The proposed patch is attached. >> > > =A0 =A0 =A0Did you read my code comment in "in.c", at line 1115 ? Yes, I've read it. However, it's really done only when an interface address is being deleted and LLE_STATIC is given. If an interface is going down, then loopback route is untouched (except of mentioned inconsistence with refcount). > >> >> However, I prefer to call in_addprefix() (which is static now) instead >> of rtinit() and add some more checks from in_ifinit(). Can you (or >> anyone) review the patch? >> > > =A0 =A0 There are quite a few cases to cover, including bootp, which take= s a different code path > =A0 =A0 than DHCP through the routing code. I would appreciate that you t= est these cases before > =A0 =A0 making any code commits. It's taken some time to get the overall = routing code stabilized. > =A0 =A0 There is still a bug in the Radix code that needs fixing. > > =A0 =A0-- Qing I understand, but I use my own DHCP client. Well, I try to look at it, but maybe, someone else can test it. Svata