Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 09 Sep 2008 21:38:13 +0300
From:      Giorgos Keramidas <keramida@freebsd.org>
To:        Julian Elischer <julian@elischer.org>
Cc:        freebsd-net@freebsd.org
Subject:   Re: rewrite of rt_check() (now rt_check_fib())
Message-ID:  <87y721go6i.fsf@kobe.laptop>
In-Reply-To: <48C1BD31.6090804@elischer.org> (Julian Elischer's message of "Fri, 05 Sep 2008 16:13:53 -0700")
References:  <48C19568.807@elischer.org> <48C1B774.2020405@elischer.org> <48C1B83C.9000404@elischer.org> <48C1BD31.6090804@elischer.org>

next in thread | previous in thread | raw e-mail | index | archive | help
--=-=-=
Content-Transfer-Encoding: quoted-printable

Hi Julian.

Has anyone else tested this patch?  I'm going to have a bit of time to
try reproducing this again in the following days.  Is this patch version
the last one you have written?  Should I patch with this one and give it
a try?

FWIW, reading through this version of rt_check_fib() is nicer, and I
really liked the comment that explains how it works :-)

On Fri, 05 Sep 2008 16:13:53 -0700, Julian Elischer <julian@elischer.org> w=
rote:
> this time with less (I hope) bugs...
>
> new macros...
>
> #define RT_TEMP_UNLOCK(_rt) do {                                \
>         RT_ADDREF(_rt);                                         \
>         RT_UNLOCK(_rt);                                         \
> } while (0)
>
> #define RT_RELOCK(_rt) do {                                     \
>         RT_LOCK(_rt)                                            \
>         if ((_rt)->rt_refcnt <=3D 1)                              \
>                 rtfree(_rt);                                    \
>                 _rt =3D 0; /*  signal that it went away */        \
>         else {                                                  \
>                 RT_REMREF(_rt);                                 \
>                 /* note that _rt is still valid */              \
>         }                                                       \
> } while (0)
>
>
> with (better) code attached:
>
> /*
>  * rt_check() is invoked on each layer 2 output path, prior to
>  * encapsulating outbound packets.
>  *
>  * The function is mostly used to find a routing entry for the gateway,
>  * which in some protocol families could also point to the link-level
>  * address for the gateway itself (the side effect of revalidating the
>  * route to the destination is rather pointless at this stage, we did it
>  * already a moment before in the pr_output() routine to locate the ifp
>  * and gateway to use).
>  *
>  * When we remove the layer-3 to layer-2 mapping tables from the
>  * routing table, this function can be removed.
>  *
>  * =3D=3D=3D On input =3D=3D=3D
>  *   *dst is the address of the NEXT HOP (which coincides with the
>  *	final destination if directly reachable);
>  *   *lrt0 points to the cached route to the final destination;
>  *   *lrt is not meaningful;
>  *    fibnum is the index to the correct network fib for this packet
>  *	(*lrt0 has not ref held on it so REMREF is not needed )
>  *
>  * =3D=3D=3D Operation =3D=3D=3D
>  * If the route is marked down try to find a new route.  If the route
>  * to the gateway is gone, try to setup a new route.  Otherwise,
>  * if the route is marked for packets to be rejected, enforce that.
>  * Note that rtalloc returns an rtentry with an extra REF that we need to=
 lose.
>  *
>  * =3D=3D=3D On return =3D=3D=3D
>  *   *dst is unchanged;
>  *   *lrt0 points to the (possibly new) route to the final destination
>  *   *lrt points to the route to the next hop   [LOCKED]
>  *
>  * Their values are meaningful ONLY if no error is returned.
>  *
>  * To follow this you have to remember that:
>  * RT_REMREF reduces the reference count by 1 but doesn't check it for 0 =
(!)
>  * RTFREE_LOCKED includes an RT_REMREF (or an rtfree if refs =3D=3D 1)
>  *    and an RT_UNLOCK
>  * RTFREE does an RT_LOCK and an RTFREE_LOCKED
>  * The gwroute pointer counts as a reference on the rtentry to which it p=
oints.
>  * so when we add it we use the ref that rtalloc gives us and when we los=
e it
>  * we need to remove the reference.
>  */
> int
> rt_check(struct rtentry **lrt, struct rtentry **lrt0, struct sockaddr *ds=
t)
> {
> 	return (rt_check_fib(lrt, lrt0, dst, 0));
> }
>
> int
> rt_check_fib(struct rtentry **lrt, struct rtentry **lrt0, struct sockaddr=
 *dst,
> 		u_int fibnum)
> {
> 	struct rtentry *rt;
> 	struct rtentry *rt0;
> 	int error;
>
> 	KASSERT(*lrt0 !=3D NULL, ("rt_check"));
> 	rt0 =3D *lrt0;
> 	rt =3D NULL;
>
> 	/* NB: the locking here is tortuous... */
> 	RT_LOCK(rt0);
> retry:
> 	if (rt0 && (rt0->rt_flags & RTF_UP) =3D=3D 0) {
> 		/* Current rt0 is useless, try get a replacement. */
> 		RT_UNLOCK(rt0);
> 		rt0 =3D NULL;
> 	}
> 	if (rt0 =3D=3D NULL) {
> 		rt0 =3D rtalloc1_fib(dst, 1, 0UL, fibnum);
> 		if (rt0 =3D=3D NULL) {
> 			return (EHOSTUNREACH);
> 		}
> 		RT_REMREF(rt0); /* don't need the reference. */
> 	}
>
> 	if (rt0->rt_flags & RTF_GATEWAY) {
> 		if ((rt =3D rt0->rt_gwroute) !=3D NULL) {
> 			RT_LOCK(rt);		/* NB: gwroute */
> 			if ((rt->rt_flags & RTF_UP) =3D=3D 0) {
> 				/* gw route is dud. ignore/lose it */
> 				RTFREE_LOCKED(rt); /* unref (&unlock) gwroute */
> 				rt =3D rt0->rt_gwroute =3D NULL;
> 			}
> 		}
>=20=09=09
> 		if (rt =3D=3D NULL) {  /* NOT AN ELSE CLAUSE */
> 			RT_TEMP_UNLOCK(rt0); /* MUST return to undo this */
> 			rt =3D rtalloc1_fib(rt0->rt_gateway, 1, 0UL, fibnum);
> 			if ((rt =3D=3D rt0) || (rt =3D=3D NULL)) {
> 				/* the best we can do is not good enough */
> 				if (rt) {
> 					RT_REMREF(rt); /* assumes ref > 0 */
> 					RT_UNLOCK(rt);
> 				}
> 				RT_FREE(rt0); /* lock, unref, (unlock) */
> 				return (ENETUNREACH);
> 			}
> 			/*
> 			 * Relock it and lose the added reference.
> 			 * All sorts of things could have happenned while we
> 			 * had no lock on it, so check for them.
> 			 */
> 			RT_RELOCK(rt0);
> 			if (rt0 =3D=3D NULL || ((rt0->rt_flags & RTF_UP) =3D=3D 0))
> 				/* Ru-roh.. what we had is no longer any good */
> 				goto retry;
> 			/*=20
> 			 * While we were away, someone replaced the gateway.
> 			 * Since a reference count is involved we can't just
> 			 * overwrite it.
> 			 */
> 			if (rt0->rt_gwroute) {
> 				if (rt0->rt_gwroute !=3D rt) {
> 					RT_FREE_LOCKED(rt);
> 					goto retry;
> 				}
> 			} else {
> 				rt0->rt_gwroute =3D rt;
> 			}
> 		}
> 		RT_LOCK_ASSERT(rt);
> 		RT_UNLOCK(rt0);
> 	} else {
> 		/* think of rt as having the lock from now on.. */
> 		rt =3D rt0;
> 	}
> 	/* XXX why are we inspecting rmx_expire? */
> 	if ((rt->rt_flags & RTF_REJECT) &&
> 	    (rt->rt_rmx.rmx_expire =3D=3D 0 ||
> 	    time_uptime < rt->rt_rmx.rmx_expire)) {
> 		RT_UNLOCK(rt);
> 		return (rt =3D=3D rt0 ? EHOSTDOWN : EHOSTUNREACH);
> 	}
>
> 	*lrt =3D rt;
> 	*lrt0 =3D rt0;
> 	return (0);
> }

--=-=-=
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (FreeBSD)

iEYEARECAAYFAkjGwqAACgkQ1g+UGjGGA7YxYgCeLIxaFENkLPluvhICMOIjfsxP
xHcAoMEJ99l6gQV7lsQKpOPs9/G6EMAT
=/10u
-----END PGP SIGNATURE-----
--=-=-=--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?87y721go6i.fsf>