From owner-svn-src-all@freebsd.org Mon Oct 21 08:48:48 2019 Return-Path: Delivered-To: svn-src-all@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 49345150A47; Mon, 21 Oct 2019 08:48:48 +0000 (UTC) (envelope-from bz@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 46xVfm1ChXz3xyp; Mon, 21 Oct 2019 08:48:48 +0000 (UTC) (envelope-from bz@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 0CEC921794; Mon, 21 Oct 2019 08:48:48 +0000 (UTC) (envelope-from bz@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id x9L8mlP0033780; Mon, 21 Oct 2019 08:48:47 GMT (envelope-from bz@FreeBSD.org) Received: (from bz@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id x9L8mlLm033778; Mon, 21 Oct 2019 08:48:47 GMT (envelope-from bz@FreeBSD.org) Message-Id: <201910210848.x9L8mlLm033778@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: bz set sender to bz@FreeBSD.org using -f From: "Bjoern A. Zeeb" Date: Mon, 21 Oct 2019 08:48:47 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r353793 - head/sys/netinet6 X-SVN-Group: head X-SVN-Commit-Author: bz X-SVN-Commit-Paths: head/sys/netinet6 X-SVN-Commit-Revision: 353793 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Oct 2019 08:48:48 -0000 Author: bz Date: Mon Oct 21 08:48:47 2019 New Revision: 353793 URL: https://svnweb.freebsd.org/changeset/base/353793 Log: frag6: fix vnet teardown leak When shutting down a VNET we did not cleanup the fragmentation hashes. This has multiple problems: (1) leak memory but also (2) leak on the global counters, which might eventually lead to a problem on a system starting and stopping a lot of vnets and dealing with a lot of IPv6 fragments that the counters/limits would be exhausted and processing would no longer take place. Unfortunately we do not have a useable variable to indicate when per-VNET initialization of frag6 has happened (or when destroy happened) so introduce a boolean to flag this. This is needed here as well as it was in r353635 for ip_reass.c in order to avoid tripping over the already destroyed locks if interfaces go away after the frag6 destroy. While splitting things up convert the TRY_LOCK to a LOCK operation in now frag6_drain_one(). The try-lock was derived from a manual hand-rolled implementation and carried forward all the time. We no longer can afford not to get the lock as that would mean we would continue to leak memory. Assert that all the buckets are empty before destroying to lock to ensure long-term stability of a clean shutdown. Reported by: hselasky Reviewed by: hselasky MFC after: 3 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D22054 Modified: head/sys/netinet6/frag6.c head/sys/netinet6/ip6_input.c head/sys/netinet6/ip6_var.h Modified: head/sys/netinet6/frag6.c ============================================================================== --- head/sys/netinet6/frag6.c Mon Oct 21 08:36:15 2019 (r353792) +++ head/sys/netinet6/frag6.c Mon Oct 21 08:48:47 2019 (r353793) @@ -100,6 +100,12 @@ struct ip6asfrag { static MALLOC_DEFINE(M_FRAG6, "frag6", "IPv6 fragment reassembly header"); +#ifdef VIMAGE +/* A flag to indicate if IPv6 fragmentation is initialized. */ +VNET_DEFINE_STATIC(bool, frag6_on); +#define V_frag6_on VNET(frag6_on) +#endif + /* System wide (global) maximum and count of packets in reassembly queues. */ static int ip6_maxfrags; static volatile u_int frag6_nfrags = 0; @@ -289,6 +295,15 @@ frag6_cleanup(void *arg __unused, struct ifnet *ifp) KASSERT(ifp != NULL, ("%s: ifp is NULL", __func__)); +#ifdef VIMAGE + /* + * Skip processing if IPv6 reassembly is not initialised or + * torn down by frag6_destroy(). + */ + if (!V_frag6_on) + return; +#endif + CURVNET_SET_QUIET(ifp->if_vnet); for (i = 0; i < IP6REASS_NHASH; i++) { IP6QB_LOCK(i); @@ -929,6 +944,9 @@ frag6_init(void) } V_ip6qb_hashseed = arc4random(); V_ip6_maxfragsperpacket = 64; +#ifdef VIMAGE + V_frag6_on = true; +#endif if (!IS_DEFAULT_VNET(curvnet)) return; @@ -940,31 +958,57 @@ frag6_init(void) /* * Drain off all datagram fragments. */ +static void +frag6_drain_one(void) +{ + struct ip6q *head; + uint32_t bucket; + + for (bucket = 0; bucket < IP6REASS_NHASH; bucket++) { + IP6QB_LOCK(bucket); + head = IP6QB_HEAD(bucket); + while (head->ip6q_next != head) { + IP6STAT_INC(ip6s_fragdropped); + /* XXX in6_ifstat_inc(ifp, ifs6_reass_fail) */ + frag6_freef(head->ip6q_next, bucket); + } + IP6QB_UNLOCK(bucket); + } +} + void frag6_drain(void) { VNET_ITERATOR_DECL(vnet_iter); - struct ip6q *head; - uint32_t bucket; VNET_LIST_RLOCK_NOSLEEP(); VNET_FOREACH(vnet_iter) { CURVNET_SET(vnet_iter); - for (bucket = 0; bucket < IP6REASS_NHASH; bucket++) { - if (IP6QB_TRYLOCK(bucket) == 0) - continue; - head = IP6QB_HEAD(bucket); - while (head->ip6q_next != head) { - IP6STAT_INC(ip6s_fragdropped); - /* XXX in6_ifstat_inc(ifp, ifs6_reass_fail) */ - frag6_freef(head->ip6q_next, bucket); - } - IP6QB_UNLOCK(bucket); - } + frag6_drain_one(); CURVNET_RESTORE(); } VNET_LIST_RUNLOCK_NOSLEEP(); } + +#ifdef VIMAGE +/* + * Clear up IPv6 reassembly structures. + */ +void +frag6_destroy(void) +{ + uint32_t bucket; + + frag6_drain_one(); + V_frag6_on = false; + for (bucket = 0; bucket < IP6REASS_NHASH; bucket++) { + KASSERT(V_ip6qb[bucket].count == 0, + ("%s: V_ip6qb[%d] (%p) count not 0 (%d)", __func__, + bucket, &V_ip6qb[bucket], V_ip6qb[bucket].count)); + mtx_destroy(&V_ip6qb[bucket].lock); + } +} +#endif /* * Put an ip fragment on a reassembly chain. Modified: head/sys/netinet6/ip6_input.c ============================================================================== --- head/sys/netinet6/ip6_input.c Mon Oct 21 08:36:15 2019 (r353792) +++ head/sys/netinet6/ip6_input.c Mon Oct 21 08:48:47 2019 (r353793) @@ -393,6 +393,7 @@ ip6_destroy(void *unused __unused) } IFNET_RUNLOCK(); + frag6_destroy(); nd6_destroy(); in6_ifattach_destroy(); Modified: head/sys/netinet6/ip6_var.h ============================================================================== --- head/sys/netinet6/ip6_var.h Mon Oct 21 08:36:15 2019 (r353792) +++ head/sys/netinet6/ip6_var.h Mon Oct 21 08:48:47 2019 (r353793) @@ -392,6 +392,7 @@ int ip6_fragment(struct ifnet *, struct mbuf *, int, u int route6_input(struct mbuf **, int *, int); void frag6_init(void); +void frag6_destroy(void); int frag6_input(struct mbuf **, int *, int); void frag6_slowtimo(void); void frag6_drain(void);