From owner-dev-commits-src-branches@freebsd.org Tue Sep 7 21:12:45 2021 Return-Path: Delivered-To: dev-commits-src-branches@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id F2F3D66C3EC; Tue, 7 Sep 2021 21:12:44 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4H3yf14QBvz4VqX; Tue, 7 Sep 2021 21:12:41 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 0AC2219948; Tue, 7 Sep 2021 21:12:41 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 187LCeiI023704; Tue, 7 Sep 2021 21:12:40 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 187LCetg023703; Tue, 7 Sep 2021 21:12:40 GMT (envelope-from git) Date: Tue, 7 Sep 2021 21:12:40 GMT Message-Id: <202109072112.187LCetg023703@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: "Alexander V. Chernikov" Subject: git: 311cf25c240b - stable/13 - Simplify ifa/ifp refcounting in the routing stack. MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: melifaro X-Git-Repository: src X-Git-Refname: refs/heads/stable/13 X-Git-Reftype: branch X-Git-Commit: 311cf25c240b8838cee5a1afed5b6e8647e21330 Auto-Submitted: auto-generated X-BeenThere: dev-commits-src-branches@freebsd.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: Commits to the stable branches of the FreeBSD src repository List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 07 Sep 2021 21:12:45 -0000 The branch stable/13 has been updated by melifaro: URL: https://cgit.FreeBSD.org/src/commit/?id=311cf25c240b8838cee5a1afed5b6e8647e21330 commit 311cf25c240b8838cee5a1afed5b6e8647e21330 Author: Alexander V. Chernikov AuthorDate: 2021-02-22 21:42:27 +0000 Commit: Alexander V. Chernikov CommitDate: 2021-09-07 20:55:51 +0000 Simplify ifa/ifp refcounting in the routing stack. The routing stack control depends on quite a tree of functions to determine the proper attributes of a route such as a source address (ifa) or transmit ifp of a route. When actually inserting a route, the stack needs to ensure that ifa and ifp points to the entities that are still valid. Validity means slightly more than just pointer validity - stack need guarantee that the provided objects are not scheduled for deletion. Currently, callers either ignore it (most ifp parts, historically) or try to use refcounting (ifa parts). Even in case of ifa refcounting it's not always implemented in fully-safe manner. For example, some codepaths inside rt_getifa_fib() are referencing ifa while not holding any locks, resulting in possibility of referencing scheduled-for-deletion ifa. Instead of trying to fix all of the callers by enforcing proper refcounting, switch to a different model. As the rib_action() already requires epoch, do not require any stability guarantees other than the epoch-provided one. Use newly-added conditional versions of the refcounting functions (ifa_try_ref(), if_try_ref()) and fail if any of these fails. Reviewed by: donner Differential Revision: https://reviews.freebsd.org/D28837 (cherry picked from commit 596417283722ee62ed17aed1c875ad90c01cbb0e) --- sys/net/route.c | 14 +++-------- sys/net/route/nhop_ctl.c | 58 ++++++++++++++++++++++++++----------------- sys/net/route/route_ctl.c | 17 ++----------- sys/net/route/route_ifaddrs.c | 12 ++------- 4 files changed, 42 insertions(+), 59 deletions(-) diff --git a/sys/net/route.c b/sys/net/route.c index f07cb3f6581a..2416aa9a983f 100644 --- a/sys/net/route.c +++ b/sys/net/route.c @@ -207,7 +207,6 @@ rib_add_redirect(u_int fibnum, struct sockaddr *dst, struct sockaddr *gateway, /* Get the best ifa for the given interface and gateway. */ if ((ifa = ifaof_ifpforaddr(gateway, ifp)) == NULL) return (ENETUNREACH); - ifa_ref(ifa); bzero(&info, sizeof(info)); info.rti_info[RTAX_DST] = dst; @@ -224,7 +223,6 @@ rib_add_redirect(u_int fibnum, struct sockaddr *dst, struct sockaddr *gateway, info.rti_rmx = &rti_rmx; error = rib_action(fibnum, RTM_ADD, &info, &rc); - ifa_free(ifa); if (error != 0) { /* TODO: add per-fib redirect stats. */ @@ -503,8 +501,7 @@ rt_flushifroutes(struct ifnet *ifp) } /* - * Look up rt_addrinfo for a specific fib. Note that if rti_ifa is defined, - * it will be referenced so the caller must free it. + * Look up rt_addrinfo for a specific fib. * * Assume basic consistency checks are executed by callers: * RTAX_DST exists, if RTF_GATEWAY is set, RTAX_GATEWAY exists as well. @@ -513,8 +510,7 @@ int rt_getifa_fib(struct rt_addrinfo *info, u_int fibnum) { const struct sockaddr *dst, *gateway, *ifpaddr, *ifaaddr; - struct epoch_tracker et; - int needref, error, flags; + int error, flags; dst = info->rti_info[RTAX_DST]; gateway = info->rti_info[RTAX_GATEWAY]; @@ -527,8 +523,6 @@ rt_getifa_fib(struct rt_addrinfo *info, u_int fibnum) * when protocol address is ambiguous. */ error = 0; - needref = (info->rti_ifa == NULL); - NET_EPOCH_ENTER(et); /* If we have interface specified by the ifindex in the address, use it */ if (info->rti_ifp == NULL && ifpaddr != NULL && @@ -583,13 +577,11 @@ rt_getifa_fib(struct rt_addrinfo *info, u_int fibnum) info->rti_ifa = ifa_ifwithroute(flags, sa, sa, fibnum); } - if (needref && info->rti_ifa != NULL) { + if (info->rti_ifa != NULL) { if (info->rti_ifp == NULL) info->rti_ifp = info->rti_ifa->ifa_ifp; - ifa_ref(info->rti_ifa); } else error = ENETUNREACH; - NET_EPOCH_EXIT(et); return (error); } diff --git a/sys/net/route/nhop_ctl.c b/sys/net/route/nhop_ctl.c index 7de553799fab..92b43871d604 100644 --- a/sys/net/route/nhop_ctl.c +++ b/sys/net/route/nhop_ctl.c @@ -84,7 +84,7 @@ static int get_nhop(struct rib_head *rnh, struct rt_addrinfo *info, struct nhop_priv **pnh_priv); static int finalize_nhop(struct nh_control *ctl, struct rt_addrinfo *info, struct nhop_priv *nh_priv); -static struct ifnet *get_aifp(const struct nhop_object *nh, int reference); +static struct ifnet *get_aifp(const struct nhop_object *nh); static void fill_sdl_from_ifp(struct sockaddr_dl_short *sdl, const struct ifnet *ifp); static void destroy_nhop_epoch(epoch_context_t ctx); @@ -120,12 +120,10 @@ nhops_init(void) * this interface ifp instead of loopback. This is needed to support * link-local IPv6 loopback communications. * - * If @reference is non-zero, found ifp is referenced. - * * Returns found ifp. */ static struct ifnet * -get_aifp(const struct nhop_object *nh, int reference) +get_aifp(const struct nhop_object *nh) { struct ifnet *aifp = NULL; @@ -138,21 +136,15 @@ get_aifp(const struct nhop_object *nh, int reference) */ if ((nh->nh_ifp->if_flags & IFF_LOOPBACK) && nh->gw_sa.sa_family == AF_LINK) { - if (reference) - aifp = ifnet_byindex_ref(nh->gwl_sa.sdl_index); - else - aifp = ifnet_byindex(nh->gwl_sa.sdl_index); + aifp = ifnet_byindex(nh->gwl_sa.sdl_index); if (aifp == NULL) { DPRINTF("unable to get aifp for %s index %d", if_name(nh->nh_ifp), nh->gwl_sa.sdl_index); } } - if (aifp == NULL) { + if (aifp == NULL) aifp = nh->nh_ifp; - if (reference) - if_ref(aifp); - } return (aifp); } @@ -297,7 +289,7 @@ fill_nhop_from_info(struct nhop_priv *nh_priv, struct rt_addrinfo *info) nh->nh_ifp = info->rti_ifa->ifa_ifp; nh->nh_ifa = info->rti_ifa; /* depends on the gateway */ - nh->nh_aifp = get_aifp(nh, 0); + nh->nh_aifp = get_aifp(nh); /* * Note some of the remaining data is set by the @@ -438,7 +430,7 @@ alter_nhop_from_info(struct nhop_object *nh, struct rt_addrinfo *info) nh->nh_ifa = info->rti_ifa; if (info->rti_ifp != NULL) nh->nh_ifp = info->rti_ifp; - nh->nh_aifp = get_aifp(nh, 0); + nh->nh_aifp = get_aifp(nh); return (0); } @@ -512,6 +504,26 @@ alloc_nhop_structure() return (nh_priv); } +static bool +reference_nhop_deps(struct nhop_object *nh) +{ + if (!ifa_try_ref(nh->nh_ifa)) + return (false); + nh->nh_aifp = get_aifp(nh); + if (!if_try_ref(nh->nh_aifp)) { + ifa_free(nh->nh_ifa); + return (false); + } + DPRINTF("AIFP: %p nh_ifp %p", nh->nh_aifp, nh->nh_ifp); + if (!if_try_ref(nh->nh_ifp)) { + ifa_free(nh->nh_ifa); + if_rele(nh->nh_aifp); + return (false); + } + + return (true); +} + /* * Alocates/references the remaining bits of nexthop data and links * it to the hash table. @@ -522,9 +534,7 @@ static int finalize_nhop(struct nh_control *ctl, struct rt_addrinfo *info, struct nhop_priv *nh_priv) { - struct nhop_object *nh; - - nh = nh_priv->nh; + struct nhop_object *nh = nh_priv->nh; /* Allocate per-cpu packet counter */ nh->nh_pksent = counter_u64_alloc(M_NOWAIT); @@ -535,15 +545,17 @@ finalize_nhop(struct nh_control *ctl, struct rt_addrinfo *info, return (ENOMEM); } + if (!reference_nhop_deps(nh)) { + counter_u64_free(nh->nh_pksent); + uma_zfree(nhops_zone, nh); + RTSTAT_INC(rts_nh_alloc_failure); + DPRINTF("nh_alloc_finalize failed - reference failure"); + return (EAGAIN); + } + /* Save vnet to ease destruction */ nh_priv->nh_vnet = curvnet; - /* Reference external objects and calculate (referenced) ifa */ - if_ref(nh->nh_ifp); - ifa_ref(nh->nh_ifa); - nh->nh_aifp = get_aifp(nh, 1); - DPRINTF("AIFP: %p nh_ifp %p", nh->nh_aifp, nh->nh_ifp); - refcount_init(&nh_priv->nh_refcnt, 1); /* Please see nhop_free() comments on the initial value */ diff --git a/sys/net/route/route_ctl.c b/sys/net/route/route_ctl.c index 2ec25c94299d..a0c4a2283a00 100644 --- a/sys/net/route/route_ctl.c +++ b/sys/net/route/route_ctl.c @@ -600,12 +600,9 @@ create_rtentry(struct rib_head *rnh, struct rt_addrinfo *info, error = rt_getifa_fib(info, rnh->rib_fibnum); if (error) return (error); - } else { - ifa_ref(info->rti_ifa); } error = nhop_create_from_info(rnh, info, &nh); - ifa_free(info->rti_ifa); if (error != 0) return (error); @@ -923,7 +920,6 @@ static int change_nhop(struct rib_head *rnh, struct rt_addrinfo *info, struct nhop_object *nh_orig, struct nhop_object **nh_new) { - int free_ifa = 0; int error; /* @@ -937,24 +933,15 @@ change_nhop(struct rib_head *rnh, struct rt_addrinfo *info, (info->rti_info[RTAX_IFA] != NULL && !sa_equal(info->rti_info[RTAX_IFA], nh_orig->nh_ifa->ifa_addr))) { error = rt_getifa_fib(info, rnh->rib_fibnum); - if (info->rti_ifa != NULL) - free_ifa = 1; if (error != 0) { - if (free_ifa) { - ifa_free(info->rti_ifa); - info->rti_ifa = NULL; - } - + info->rti_ifa = NULL; return (error); } } error = nhop_create_from_nhop(rnh, nh_orig, info, nh_new); - if (free_ifa) { - ifa_free(info->rti_ifa); - info->rti_ifa = NULL; - } + info->rti_ifa = NULL; return (error); } diff --git a/sys/net/route/route_ifaddrs.c b/sys/net/route/route_ifaddrs.c index 853f7f8fbe15..15ee13201059 100644 --- a/sys/net/route/route_ifaddrs.c +++ b/sys/net/route/route_ifaddrs.c @@ -143,7 +143,6 @@ ifa_maintain_loopback_route(int cmd, const char *otype, struct ifaddr *ifa, struct rt_addrinfo info; struct sockaddr_dl null_sdl; struct ifnet *ifp; - struct ifaddr *rti_ifa = NULL; ifp = ifa->ifa_ifp; @@ -153,12 +152,8 @@ ifa_maintain_loopback_route(int cmd, const char *otype, struct ifaddr *ifa, info.rti_ifp = V_loif; if (cmd == RTM_ADD) { /* explicitly specify (loopback) ifa */ - if (info.rti_ifp != NULL) { - rti_ifa = ifaof_ifpforaddr(ifa->ifa_addr, info.rti_ifp); - if (rti_ifa != NULL) - ifa_ref(rti_ifa); - info.rti_ifa = rti_ifa; - } + if (info.rti_ifp != NULL) + info.rti_ifa = ifaof_ifpforaddr(ifa->ifa_addr, info.rti_ifp); } info.rti_flags = ifa->ifa_flags | RTF_HOST | RTF_STATIC | RTF_PINNED; info.rti_info[RTAX_DST] = ia; @@ -168,9 +163,6 @@ ifa_maintain_loopback_route(int cmd, const char *otype, struct ifaddr *ifa, error = rib_action(ifp->if_fib, cmd, &info, &rc); NET_EPOCH_EXIT(et); - if (rti_ifa != NULL) - ifa_free(rti_ifa); - if (error == 0 || (cmd == RTM_ADD && error == EEXIST) || (cmd == RTM_DELETE && (error == ENOENT || error == ESRCH)))