From nobody Thu Jun  1 07:08:45 2023
Date: Thu, 1 Jun 2023 07:08:45 GMT
Message-Id: <202306010708.35178juk076673@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: "Alexander V. Chernikov"
Subject: git: d18715475071 - main - netlink: use custom uma zone for the mbuf storage.
List-Id: Commit messages for the main branch of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main
Sender: owner-dev-commits-src-main@freebsd.org
X-BeenThere: dev-commits-src-main@freebsd.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: melifaro
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: d187154750711c6c3bfd4feb573b2ad26de29bf2
Auto-Submitted: auto-generated
X-ThisMailContainsUnwantedMimeParts: N

The branch main has been updated by melifaro:

URL: https://cgit.FreeBSD.org/src/commit/?id=d187154750711c6c3bfd4feb573b2ad26de29bf2

commit d187154750711c6c3bfd4feb573b2ad26de29bf2
Author:     Alexander V. Chernikov
AuthorDate: 2023-05-31 18:02:49 +0000
Commit:     Alexander V. Chernikov
CommitDate: 2023-06-01 06:43:39 +0000

    netlink: use custom uma zone for the mbuf storage.

    Netlink communicates with userland via sockets, utilising MCLBYTES-sized
    mbufs to append data to the socket buffers. These mbufs are never
    transmitted via a logical or physical network.

    The 2k mbuf zone may be temporarily exhausted by DDoS-style traffic,
    leaving Netlink unable to respond to requests.

    To address this, this change introduces a custom Netlink-specific zone
    for the mbuf storage. It has the following benefits:
    * no precious memory from UMA_ZONE_CONTIG zones is utilized for Netlink
    * Netlink becomes (more) independent from the traffic spikes and
      other related network "corner" conditions.
    * Netlink allocations are now isolated within a specific zone, making
      it easier to track Netlink mbuf usage and attribute mbufs.

    Reviewed by:    gallatin, adrian
    Differential Revision: https://reviews.freebsd.org/D40356
    MFC after:      2 weeks
---
 sys/netlink/netlink_message_writer.c | 78 ++++++++++++++++++++++++++++++++----
 sys/netlink/netlink_module.c         |  2 +
 sys/netlink/netlink_var.h            | 10 ++++-
 3 files changed, 81 insertions(+), 9 deletions(-)

diff --git a/sys/netlink/netlink_message_writer.c b/sys/netlink/netlink_message_writer.c
index f885b88702ee..841bdb2d5c0b 100644
--- a/sys/netlink/netlink_message_writer.c
+++ b/sys/netlink/netlink_message_writer.c
@@ -53,7 +53,7 @@ _DECLARE_DEBUG(LOG_INFO);
  * The goal of this file is to provide convenient message writing KPI on top of
  * different storage methods (mbufs, uio, temporary memory chunks).
  *
- * The main KPI guarantee is the the (last) message always resides in the contiguous
+ * The main KPI guarantee is that the (last) message always resides in the contiguous
  * memory buffer, so one is able to update the header after writing the entire message.
  *
  * This guarantee comes with a side effect of potentially reallocating underlying
@@ -79,6 +79,71 @@ _DECLARE_DEBUG(LOG_INFO);
  * change. It happens transparently to the caller.
  */
 
+/*
+ * Uma zone for the mbuf-based Netlink storage
+ */
+static uma_zone_t nlmsg_zone;
+
+static void
+nl_free_mbuf_storage(struct mbuf *m)
+{
+	uma_zfree(nlmsg_zone, m->m_ext.ext_buf);
+}
+
+static int
+nl_setup_mbuf_storage(void *mem, int size, void *arg, int how __unused)
+{
+	struct mbuf *m = (struct mbuf *)arg;
+
+	if (m != NULL)
+		m_extadd(m, mem, size, nl_free_mbuf_storage, NULL, NULL, 0, EXT_MOD_TYPE);
+
+	return (0);
+}
+
+static struct mbuf *
+nl_get_mbuf_flags(int size, int malloc_flags, int mbuf_flags)
+{
+	struct mbuf *m, *m_storage;
+
+	if (size <= MHLEN)
+		return (m_get2(size, malloc_flags, MT_DATA, mbuf_flags));
+
+	if (__predict_false(size > NLMBUFSIZE))
+		return (NULL);
+
+	m = m_gethdr(malloc_flags, MT_DATA);
+	if (m == NULL)
+		return (NULL);
+
+	m_storage = uma_zalloc_arg(nlmsg_zone, m, malloc_flags);
+	if (m_storage == NULL) {
+		m_free_raw(m);
+		return (NULL);
+	}
+
+	return (m);
+}
+
+static struct mbuf *
+nl_get_mbuf(int size, int malloc_flags)
+{
+	return (nl_get_mbuf_flags(size, malloc_flags, M_PKTHDR));
+}
+
+void
+nl_init_msg_zone(void)
+{
+	nlmsg_zone = uma_zcreate("netlink", NLMBUFSIZE, nl_setup_mbuf_storage,
+	    NULL, NULL, NULL, UMA_ALIGN_PTR, 0);
+}
+
+void
+nl_destroy_msg_zone(void)
+{
+	uma_zdestroy(nlmsg_zone);
+}
+
 typedef bool nlwriter_op_init(struct nl_writer *nw, int size, bool waitok);
 typedef bool nlwriter_op_write(struct nl_writer *nw, void *buf, int buflen, int cnt);
 
@@ -196,17 +261,16 @@ nlmsg_write_chain_buf(struct nl_writer *nw, void *buf, int datalen, int cnt)
  * This is the most efficient mechanism as it avoids double-copying.
  *
  * Allocates a single mbuf suitable to store up to @size bytes of data.
- * If size < MHLEN (around 160 bytes), allocates mbuf with pkghdr
- * If size <= MCLBYTES (2k), allocate a single mbuf cluster
- * Otherwise, return NULL.
+ * If size < MHLEN (around 160 bytes), allocates mbuf with pkghdr.
+ * If the size <= NLMBUFSIZE (2k), allocate mbuf+storage out of nlmsg_zone.
+ * Returns NULL on greater size or the allocation failure.
  */
 static bool
 nlmsg_get_ns_mbuf(struct nl_writer *nw, int size, bool waitok)
 {
-	struct mbuf *m;
-
 	int mflag = waitok ? M_WAITOK : M_NOWAIT;
-	m = m_get2(size, mflag, MT_DATA, M_PKTHDR);
+	struct mbuf *m = nl_get_mbuf(size, mflag);
+
 	if (__predict_false(m == NULL))
 		return (false);
 	nw->alloc_len = M_TRAILINGSPACE(m);
diff --git a/sys/netlink/netlink_module.c b/sys/netlink/netlink_module.c
index 08cd08600af3..81b3c6d8e756 100644
--- a/sys/netlink/netlink_module.c
+++ b/sys/netlink/netlink_module.c
@@ -223,6 +223,7 @@ netlink_modevent(module_t mod __unused, int what, void *priv __unused)
 	switch (what) {
 	case MOD_LOAD:
 		NL_LOG(LOG_DEBUG2, "Loading");
+		nl_init_msg_zone();
 		nl_osd_register();
 #if !defined(NETLINK) && defined(NETLINK_MODULE)
 		nl_set_functions(&nl_module);
@@ -238,6 +239,7 @@ netlink_modevent(module_t mod __unused, int what, void *priv __unused)
 			nl_set_functions(NULL);
 #endif
 			nl_osd_unregister();
+			nl_destroy_msg_zone();
 		} else
 			ret = EBUSY;
 		break;
diff --git a/sys/netlink/netlink_var.h b/sys/netlink/netlink_var.h
index 8c714cda4fdc..a26d217f4023 100644
--- a/sys/netlink/netlink_var.h
+++ b/sys/netlink/netlink_var.h
@@ -36,8 +36,10 @@
 #include
 #include
 
-#define	NLSNDQ	65536	/* Default socket sendspace */
-#define	NLRCVQ	65536	/* Default socket recvspace */
+#define	NLSNDQ		65536	/* Default socket sendspace */
+#define	NLRCVQ		65536	/* Default socket recvspace */
+
+#define	NLMBUFSIZE	2048	/* External storage size for Netlink mbufs */
 
 struct ucred;
 
@@ -152,6 +154,10 @@ void nl_process_receive_locked(struct nlpcb *nlp);
 void nl_set_source_metadata(struct mbuf *m, int num_messages);
 void nl_add_msg_info(struct mbuf *m);
 
+/* netlink_message_writer.c */
+void nl_init_msg_zone(void);
+void nl_destroy_msg_zone(void);
+
 /* netlink_generic.c */
 struct genl_family {
 	const char *family_name;
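
The size-based allocation policy that nl_get_mbuf_flags() introduces can be
modelled in userspace C. This is only an illustrative sketch under stated
assumptions, not kernel code. struct pool, pool_alloc(), pool_free(),
sketch_get_storage() and the SKETCH_* constants are hypothetical names that
do not exist in FreeBSD. The inline threshold of 160 comes from the "around
160 bytes" comment in the diff rather than from the real MHLEN. A plain
counter stands in for a UMA zone.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_BUFSIZE	2048	/* mirrors NLMBUFSIZE from the diff */
#define SKETCH_INLINE	160	/* assumed stand-in for MHLEN */

/* Hypothetical counting pool: a stand-in for a fixed-item-size zone. */
struct pool {
	int limit;	/* maximum number of items the pool can hand out */
	int in_use;	/* items currently allocated */
};

/* Where a request ended up being served from. */
enum storage { ST_NONE, ST_INLINE, ST_DEDICATED };

/* Take one item from the pool; fails once the limit is reached. */
static bool
pool_alloc(struct pool *p)
{
	if (p->in_use >= p->limit)
		return (false);
	p->in_use++;
	return (true);
}

/* Return one item to the pool. */
static void
pool_free(struct pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}

/*
 * Same order of checks as nl_get_mbuf_flags(): small requests fit in the
 * mbuf itself, oversized requests are refused, and everything else is
 * served from the dedicated pool. Simplification: the inline case consumes
 * nothing here, whereas the real code still allocates an mbuf via m_get2().
 */
static enum storage
sketch_get_storage(struct pool *dedicated, int size)
{
	if (size <= SKETCH_INLINE)
		return (ST_INLINE);
	if (size > SKETCH_BUFSIZE)
		return (ST_NONE);
	return (pool_alloc(dedicated) ? ST_DEDICATED : ST_NONE);
}
```

Because the dedicated pool keeps its own counter, draining a second,
unrelated struct pool (standing in for the shared 2k cluster zone) has no
effect on sketch_get_storage(), which is the isolation property the commit
message describes.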