From owner-freebsd-net@FreeBSD.ORG Fri Sep 5 20:24:02 2008 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C4AB8106567D for ; Fri, 5 Sep 2008 20:24:02 +0000 (UTC) (envelope-from julian@elischer.org) Received: from outH.internet-mail-service.net (outh.internet-mail-service.net [216.240.47.231]) by mx1.freebsd.org (Postfix) with ESMTP id A73EF8FC1B for ; Fri, 5 Sep 2008 20:24:02 +0000 (UTC) (envelope-from julian@elischer.org) Received: from idiom.com (mx0.idiom.com [216.240.32.160]) by out.internet-mail-service.net (Postfix) with ESMTP id 225E22488 for ; Fri, 5 Sep 2008 13:24:03 -0700 (PDT) Received: from julian-mac.elischer.org (localhost [127.0.0.1]) by idiom.com (Postfix) with ESMTP id 0682A2D601E for ; Fri, 5 Sep 2008 13:24:01 -0700 (PDT) Message-ID: <48C19568.807@elischer.org> Date: Fri, 05 Sep 2008 13:24:08 -0700 From: Julian Elischer User-Agent: Thunderbird 2.0.0.16 (Macintosh/20080707) MIME-Version: 1.0 To: FreeBSD Net Content-Type: multipart/mixed; boundary="------------040202050603070306030307" Subject: rewrite of rt_check() (now rt_check_fib()) X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 05 Sep 2008 20:24:02 -0000 This is a multi-part message in MIME format. --------------040202050603070306030307 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit In tryin gto understand rt_check_fib() (wsa rt_check()) I ended up rewriting it to do what I thought it was trying to do.. this stops the panics some people have seen, but allows the system to stay up long enough to see some other problem.. anyhow this si the patch: comments? --------------040202050603070306030307 Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0"; name="adiff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="adiff" Index: route.c =================================================================== RCS file: /home/ncvs/src/sys/net/route.c,v retrieving revision 1.134 diff -u -r1.134 route.c --- route.c 1 Sep 2008 17:15:29 -0000 1.134 +++ route.c 5 Sep 2008 20:23:32 -0000 @@ -1676,16 +1676,18 @@ * *lrt0 points to the cached route to the final destination; * *lrt is not meaningful; * fibnum is the index to the correct network fib for this packet + * (*lrt0 has not ref held on it so REMREF is not needed ) * * === Operation === * If the route is marked down try to find a new route. If the route * to the gateway is gone, try to setup a new route. Otherwise, * if the route is marked for packets to be rejected, enforce that. + * Note that rtalloc returns an rtentry with an extra REF that we need to lose. * * === On return === * *dst is unchanged; * *lrt0 points to the (possibly new) route to the final destination - * *lrt points to the route to the next hop + * *lrt points to the route to the next hop [LOCKED] * * Their values are meaningful ONLY if no error is returned. */ @@ -1704,49 +1706,56 @@ int error; KASSERT(*lrt0 != NULL, ("rt_check")); - rt = rt0 = *lrt0; + rt0 = *lrt0; + rt = NULL; /* NB: the locking here is tortuous... */ - RT_LOCK(rt); - if ((rt->rt_flags & RTF_UP) == 0) { - RT_UNLOCK(rt); - rt = rtalloc1_fib(dst, 1, 0UL, fibnum); - if (rt != NULL) { - RT_REMREF(rt); + RT_LOCK(rt0); + if ((rt0->rt_flags & RTF_UP) == 0) { + /* Current rt0 is useless, try get a replacement. */ + RT_UNLOCK(rt0); + rt0 = rtalloc1_fib(dst, 1, 0UL, fibnum); + if ((rt0 != NULL)/* && (rt0->rt_flags & RTF_UP) */) { + RT_REMREF(rt0); /* XXX what about if change? */ - } else + } else { return (EHOSTUNREACH); - rt0 = rt; + } } + /* XXX BSD/OS checks dst->sa_family != AF_NS */ - if (rt->rt_flags & RTF_GATEWAY) { - if (rt->rt_gwroute == NULL) - goto lookup; - rt = rt->rt_gwroute; - RT_LOCK(rt); /* NB: gwroute */ - if ((rt->rt_flags & RTF_UP) == 0) { - RTFREE_LOCKED(rt); /* unlock gwroute */ - rt = rt0; - rt0->rt_gwroute = NULL; - lookup: + if (rt0->rt_flags & RTF_GATEWAY) { + if ((rt = rt0->rt_gwroute) != NULL) { + RT_LOCK(rt); /* NB: gwroute */ + if ((rt->rt_flags & RTF_UP) == 0) { + /* gw route is dud. ignore/lose it */ + RTFREE_LOCKED(rt); /* unlock gwroute */ + rt = rt0->rt_gwroute = NULL; + } + } + + + if (rt == NULL) { /* NOT AN ELSE CLAUSE */ RT_UNLOCK(rt0); -/* XXX MRT link level looked up in table 0 */ - rt = rtalloc1_fib(rt->rt_gateway, 1, 0UL, 0); + /* XXX MRT link level looked up in table 0 */ + rt = rtalloc1_fib(rt0->rt_gateway, 1, 0UL, fibnum); if (rt == rt0) { - RT_REMREF(rt0); - RT_UNLOCK(rt0); + /* the best we can do is not good enough */ + /* XXX does this happen? */ + RT_REMREF(rt); + RT_UNLOCK(rt); return (ENETUNREACH); } RT_LOCK(rt0); - if (rt0->rt_gwroute != NULL) - RTFREE(rt0->rt_gwroute); - rt0->rt_gwroute = rt; if (rt == NULL) { RT_UNLOCK(rt0); return (EHOSTUNREACH); } + rt0->rt_gwroute = rt; } RT_UNLOCK(rt0); + } else { + rt = rt0; } /* XXX why are we inspecting rmx_expire? */ error = (rt->rt_flags & RTF_REJECT) && --------------040202050603070306030307--