From owner-freebsd-net@FreeBSD.ORG Wed Sep 10 00:25:17 2008 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A1ACB1065676 for ; Wed, 10 Sep 2008 00:25:17 +0000 (UTC) (envelope-from julian@elischer.org) Received: from outU.internet-mail-service.net (outu.internet-mail-service.net [216.240.47.244]) by mx1.freebsd.org (Postfix) with ESMTP id 8D8748FC18 for ; Wed, 10 Sep 2008 00:25:17 +0000 (UTC) (envelope-from julian@elischer.org) Received: from idiom.com (mx0.idiom.com [216.240.32.160]) by out.internet-mail-service.net (Postfix) with ESMTP id 67510246E; Tue, 9 Sep 2008 17:25:17 -0700 (PDT) Received: from julian-mac.elischer.org (localhost [127.0.0.1]) by idiom.com (Postfix) with ESMTP id A85A62D600F; Tue, 9 Sep 2008 17:25:16 -0700 (PDT) Message-ID: <48C713F9.7010500@elischer.org> Date: Tue, 09 Sep 2008 17:25:29 -0700 From: Julian Elischer User-Agent: Thunderbird 2.0.0.16 (Macintosh/20080707) MIME-Version: 1.0 To: Giorgos Keramidas References: <48C19568.807@elischer.org> <48C1B774.2020405@elischer.org> <48C1B83C.9000404@elischer.org> <48C1BD31.6090804@elischer.org> <87y721go6i.fsf@kobe.laptop> In-Reply-To: <87y721go6i.fsf@kobe.laptop> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org Subject: Re: rewrite of rt_check() (now rt_check_fib()) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 10 Sep 2008 00:25:17 -0000 Giorgos Keramidas wrote: > Hi Julian. > > Has anyone else tested this patch? I'm going to have a bit of time to > try reproducing this again in the following days. Is this patch version > the last one you have written? Should I patch with this one and give it > a try? I think this was the last one. > > FWIW, reading through this version of rt_check_fib() is nicer, and I > really liked the comment that explains how it works :-) > > On Fri, 05 Sep 2008 16:13:53 -0700, Julian Elischer wrote: >> this time with less (I hope) bugs... >> >> new macros... >> >> #define RT_TEMP_UNLOCK(_rt) do { \ >> RT_ADDREF(_rt); \ >> RT_UNLOCK(_rt); \ >> } while (0) >> >> #define RT_RELOCK(_rt) do { \ >> RT_LOCK(_rt) \ >> if ((_rt)->rt_refcnt <= 1) \ >> rtfree(_rt); \ >> _rt = 0; /* signal that it went away */ \ >> else { \ >> RT_REMREF(_rt); \ >> /* note that _rt is still valid */ \ >> } \ >> } while (0) >> >> >> with (better) code attached: >> >> /* >> * rt_check() is invoked on each layer 2 output path, prior to >> * encapsulating outbound packets. >> * >> * The function is mostly used to find a routing entry for the gateway, >> * which in some protocol families could also point to the link-level >> * address for the gateway itself (the side effect of revalidating the >> * route to the destination is rather pointless at this stage, we did it >> * already a moment before in the pr_output() routine to locate the ifp >> * and gateway to use). >> * >> * When we remove the layer-3 to layer-2 mapping tables from the >> * routing table, this function can be removed. >> * >> * === On input === >> * *dst is the address of the NEXT HOP (which coincides with the >> * final destination if directly reachable); >> * *lrt0 points to the cached route to the final destination; >> * *lrt is not meaningful; >> * fibnum is the index to the correct network fib for this packet >> * (*lrt0 has not ref held on it so REMREF is not needed ) >> * >> * === Operation === >> * If the route is marked down try to find a new route. If the route >> * to the gateway is gone, try to setup a new route. Otherwise, >> * if the route is marked for packets to be rejected, enforce that. >> * Note that rtalloc returns an rtentry with an extra REF that we need to lose. >> * >> * === On return === >> * *dst is unchanged; >> * *lrt0 points to the (possibly new) route to the final destination >> * *lrt points to the route to the next hop [LOCKED] >> * >> * Their values are meaningful ONLY if no error is returned. >> * >> * To follow this you have to remember that: >> * RT_REMREF reduces the reference count by 1 but doesn't check it for 0 (!) >> * RTFREE_LOCKED includes an RT_REMREF (or an rtfree if refs == 1) >> * and an RT_UNLOCK >> * RTFREE does an RT_LOCK and an RTFREE_LOCKED >> * The gwroute pointer counts as a reference on the rtentry to which it points. >> * so when we add it we use the ref that rtalloc gives us and when we lose it >> * we need to remove the reference. >> */ >> int >> rt_check(struct rtentry **lrt, struct rtentry **lrt0, struct sockaddr *dst) >> { >> return (rt_check_fib(lrt, lrt0, dst, 0)); >> } >> >> int >> rt_check_fib(struct rtentry **lrt, struct rtentry **lrt0, struct sockaddr *dst, >> u_int fibnum) >> { >> struct rtentry *rt; >> struct rtentry *rt0; >> int error; >> >> KASSERT(*lrt0 != NULL, ("rt_check")); >> rt0 = *lrt0; >> rt = NULL; >> >> /* NB: the locking here is tortuous... */ >> RT_LOCK(rt0); >> retry: >> if (rt0 && (rt0->rt_flags & RTF_UP) == 0) { >> /* Current rt0 is useless, try get a replacement. */ >> RT_UNLOCK(rt0); >> rt0 = NULL; >> } >> if (rt0 == NULL) { >> rt0 = rtalloc1_fib(dst, 1, 0UL, fibnum); >> if (rt0 == NULL) { >> return (EHOSTUNREACH); >> } >> RT_REMREF(rt0); /* don't need the reference. */ >> } >> >> if (rt0->rt_flags & RTF_GATEWAY) { >> if ((rt = rt0->rt_gwroute) != NULL) { >> RT_LOCK(rt); /* NB: gwroute */ >> if ((rt->rt_flags & RTF_UP) == 0) { >> /* gw route is dud. ignore/lose it */ >> RTFREE_LOCKED(rt); /* unref (&unlock) gwroute */ >> rt = rt0->rt_gwroute = NULL; >> } >> } >> >> if (rt == NULL) { /* NOT AN ELSE CLAUSE */ >> RT_TEMP_UNLOCK(rt0); /* MUST return to undo this */ >> rt = rtalloc1_fib(rt0->rt_gateway, 1, 0UL, fibnum); >> if ((rt == rt0) || (rt == NULL)) { >> /* the best we can do is not good enough */ >> if (rt) { >> RT_REMREF(rt); /* assumes ref > 0 */ >> RT_UNLOCK(rt); >> } >> RT_FREE(rt0); /* lock, unref, (unlock) */ >> return (ENETUNREACH); >> } >> /* >> * Relock it and lose the added reference. >> * All sorts of things could have happenned while we >> * had no lock on it, so check for them. >> */ >> RT_RELOCK(rt0); >> if (rt0 == NULL || ((rt0->rt_flags & RTF_UP) == 0)) >> /* Ru-roh.. what we had is no longer any good */ >> goto retry; >> /* >> * While we were away, someone replaced the gateway. >> * Since a reference count is involved we can't just >> * overwrite it. >> */ >> if (rt0->rt_gwroute) { >> if (rt0->rt_gwroute != rt) { >> RT_FREE_LOCKED(rt); >> goto retry; >> } >> } else { >> rt0->rt_gwroute = rt; >> } >> } >> RT_LOCK_ASSERT(rt); >> RT_UNLOCK(rt0); >> } else { >> /* think of rt as having the lock from now on.. */ >> rt = rt0; >> } >> /* XXX why are we inspecting rmx_expire? */ >> if ((rt->rt_flags & RTF_REJECT) && >> (rt->rt_rmx.rmx_expire == 0 || >> time_uptime < rt->rt_rmx.rmx_expire)) { >> RT_UNLOCK(rt); >> return (rt == rt0 ? EHOSTDOWN : EHOSTUNREACH); >> } >> >> *lrt = rt; >> *lrt0 = rt0; >> return (0); >> }