Date: Thu, 7 Aug 2025 15:07:05 GMT
Message-Id: <202508071507.577F75fg006899@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Andrew Gallatin
Subject: git: c224b2ce7de0 - main - iflib: don't pullup UDP payloads to the TCP header size
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender:
 owner-dev-commits-src-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: gallatin
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: c224b2ce7de0faa28ea87edf6e74de0e4e9d33f9
Auto-Submitted: auto-generated

The branch main has been updated by gallatin:

URL: https://cgit.FreeBSD.org/src/commit/?id=c224b2ce7de0faa28ea87edf6e74de0e4e9d33f9

commit c224b2ce7de0faa28ea87edf6e74de0e4e9d33f9
Author:     Andrew Gallatin
AuthorDate: 2025-08-07 14:48:45 +0000
Commit:     Andrew Gallatin
CommitDate: 2025-08-07 14:55:16 +0000

    iflib: don't pullup UDP payloads to the TCP header size

    The IPv4 packet parsing logic in iflib is incredibly complex,
    prematurely optimized, and believes all the world is TCP. This
    causes it to pullup part of the UDP payload into the packet header,
    causing unneeded memory copies. This impacts a project I'm working
    on, and also impacts nearly any kernel user of UDP, like NFS.
    Eg, NFS over UDP will result in pullups for every datagram sent
    over an iflib NIC.

    This patch:
    - adds parsing for UDP to iflib
    - attempts to pull up the correct header size, based on
      UDP or TCP protocol type.
    - simplifies packet parsing in iflib by
       - no longer special casing having an ethernet header
         in a packet by itself
       - no longer checking that we're trying to pullup something
         beyond the end of the packet. Since we're no longer
         trying to pull up a larger TCP header, attempting to pullup
         something larger than the packet should no longer happen.
         If it does, the packet is malformed and m_pullup will
         return an error when it runs out of data in the mbuf chain

    Reviewed by: erj, glebius, kbowling
    Sponsored by: Netflix
    Differential Revision: https://reviews.freebsd.org/D51748
---
 sys/net/iflib.c | 63 ++++++++++++++++++++-------------------------------------
 1 file changed, 22 insertions(+), 41 deletions(-)

diff --git a/sys/net/iflib.c b/sys/net/iflib.c
index 2b8f0e617df3..2eca81d54f99 100644
--- a/sys/net/iflib.c
+++ b/sys/net/iflib.c
@@ -70,6 +70,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -3372,42 +3373,28 @@ iflib_parse_header(iflib_txq_t txq, if_pkt_info_t pi, struct mbuf **mp)
 #ifdef INET
 	case ETHERTYPE_IP:
 	{
-		struct mbuf *n;
-		struct ip *ip = NULL;
-		struct tcphdr *th = NULL;
-		int minthlen;
+		struct ip *ip;
+		struct tcphdr *th;
+		uint8_t hlen;
 
-		minthlen = min(m->m_pkthdr.len, pi->ipi_ehdrlen + sizeof(*ip) + sizeof(*th));
-		if (__predict_false(m->m_len < minthlen)) {
-			/*
-			 * if this code bloat is causing too much of a hit
-			 * move it to a separate function and mark it noinline
-			 */
-			if (m->m_len == pi->ipi_ehdrlen) {
-				n = m->m_next;
-				MPASS(n);
-				if (n->m_len >= sizeof(*ip)) {
-					ip = (struct ip *)n->m_data;
-					if (n->m_len >= (ip->ip_hl << 2) + sizeof(*th))
-						th = (struct tcphdr *)((caddr_t)ip + (ip->ip_hl << 2));
-				} else {
-					txq->ift_pullups++;
-					if (__predict_false((m = m_pullup(m, minthlen)) == NULL))
-						return (ENOMEM);
-					ip = (struct ip *)(m->m_data + pi->ipi_ehdrlen);
-				}
-			} else {
-				txq->ift_pullups++;
-				if (__predict_false((m = m_pullup(m, minthlen)) == NULL))
-					return (ENOMEM);
-				ip = (struct ip *)(m->m_data + pi->ipi_ehdrlen);
-				if (m->m_len >= (ip->ip_hl << 2) + sizeof(*th))
-					th = (struct tcphdr *)((caddr_t)ip + (ip->ip_hl << 2));
-			}
-		} else {
-			ip = (struct ip *)(m->m_data + pi->ipi_ehdrlen);
-			if (m->m_len >= (ip->ip_hl << 2) + sizeof(*th))
-				th = (struct tcphdr *)((caddr_t)ip + (ip->ip_hl << 2));
+		hlen = pi->ipi_ehdrlen + sizeof(*ip);
+		if (__predict_false(m->m_len < hlen)) {
+			txq->ift_pullups++;
+			if (__predict_false((m = m_pullup(m, hlen)) == NULL))
+				return (ENOMEM);
+		}
+		ip = (struct ip *)(m->m_data + pi->ipi_ehdrlen);
+		hlen = pi->ipi_ehdrlen + (ip->ip_hl << 2);
+		if (ip->ip_p == IPPROTO_TCP) {
+			hlen += sizeof(*th);
+			th = (struct tcphdr *)((char *)ip + (ip->ip_hl << 2));
+		} else if (ip->ip_p == IPPROTO_UDP) {
+			hlen += sizeof(struct udphdr);
+		}
+		if (__predict_false(m->m_len < hlen)) {
+			txq->ift_pullups++;
+			if ((m = m_pullup(m, hlen)) == NULL)
+				return (ENOMEM);
 		}
 		pi->ipi_ip_hlen = ip->ip_hl << 2;
 		pi->ipi_ipproto = ip->ip_p;
@@ -3417,12 +3404,6 @@ iflib_parse_header(iflib_txq_t txq, if_pkt_info_t pi, struct mbuf **mp)
 		/* TCP checksum offload may require TCP header length */
 		if (IS_TX_OFFLOAD4(pi)) {
 			if (__predict_true(pi->ipi_ipproto == IPPROTO_TCP)) {
-				if (__predict_false(th == NULL)) {
-					txq->ift_pullups++;
-					if (__predict_false((m = m_pullup(m, (ip->ip_hl << 2) + sizeof(*th))) == NULL))
-						return (ENOMEM);
-					th = (struct tcphdr *)((caddr_t)ip + pi->ipi_ip_hlen);
-				}
 				pi->ipi_tcp_hflags = tcp_get_flags(th);
 				pi->ipi_tcp_hlen = th->th_off << 2;
 				pi->ipi_tcp_seq = th->th_seq;
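
For readers following the header-size logic outside the kernel, here is a minimal user-space sketch of the protocol-dependent length calculation the commit message describes. It is not the iflib code: it works on a flat byte buffer instead of an mbuf chain, assumes an untagged 14-byte Ethernet header, and every name in it (required_header_len and the SK_* constants) is hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch; fixed header sizes only. */
enum {
	SK_ETHER_HLEN  = 14,	/* untagged Ethernet header (assumption) */
	SK_IP_MIN_HLEN = 20,	/* IPv4 header without options */
	SK_TCP_HLEN    = 20,	/* TCP header without options */
	SK_UDP_HLEN    = 8,	/* UDP header */
	SK_PROTO_TCP   = 6,
	SK_PROTO_UDP   = 17
};

/*
 * Return the number of leading bytes that must be contiguous to read the
 * Ethernet, IPv4 and (for TCP or UDP) transport headers of a frame.
 * Returns 0 if the buffer is too short to hold the fixed IPv4 header.
 */
size_t
required_header_len(const uint8_t *frame, size_t len)
{
	const uint8_t *ip;
	size_t hlen;

	assert(frame != NULL);
	if (len < SK_ETHER_HLEN + SK_IP_MIN_HLEN)
		return (0);

	ip = frame + SK_ETHER_HLEN;
	/* The low nibble of the first IPv4 byte is the header length in 32-bit words. */
	hlen = SK_ETHER_HLEN + ((size_t)(ip[0] & 0x0f) << 2);

	/* Byte 9 of the IPv4 header is the protocol number. */
	if (ip[9] == SK_PROTO_TCP)
		hlen += SK_TCP_HLEN;
	else if (ip[9] == SK_PROTO_UDP)
		hlen += SK_UDP_HLEN;
	/* Other protocols: only the Ethernet and IP headers are required. */

	return (hlen);
}
```

Under these assumptions, a UDP datagram with a 20-byte IP header needs 14 + 20 + 8 = 42 contiguous bytes, while a TCP-sized estimate would ask for 14 + 20 + 20 = 54 and so reach 12 bytes into the UDP payload.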