Date: Wed, 6 Aug 2025 13:28:12 +0300
From: Vadim Goncharov
To: Kajetan Staszkiewicz
Cc: net@FreeBSD.org
Subject: Re: adding fields to struct mbuf
Message-ID: <20250806132812.5d558652@nuclight.lan>
In-Reply-To: <41726ad6-9620-4902-bd66-5f1a0606becd@tuxpowered.net>
References: <20250731155550.529ce1fb@nuclight.lan> <41726ad6-9620-4902-bd66-5f1a0606becd@tuxpowered.net>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

On Tue, 5 Aug 2025 12:35:58 +0200
Kajetan Staszkiewicz wrote:

> On 2025-07-31 14:55, Vadim Goncharov wrote:
> > On Thu, 31 Jul 2025 13:03:22 +0200
> > Kajetan Staszkiewicz wrote:
> >
> >> Hello group,
> >>
> >> I'm researching loop prevention in pfil. There are cases where packets
> >> are reinjected into the network stack and would be handled by the same
> >> hooks again, e.g. pf + dummynet, where currently pf itself handles loop
> >> prevention on its own. My current experiment's approach to making loop
> >> prevention a general, non-pf-specific thing is to create a new mtag with
> >> a pointer to the last hook and update it in pfil.c/pfil_mbuf_common().
> >> That works well so far, but it means memory allocation when pfil hooks
> >> are involved.
> >> I'm unsure what the impact on performance would be.
> >> Another approach would be to extend struct mbuf, or probably rather
> >> struct m_pkthdr, to contain the aforementioned pointer. But is changing
> >> that struct something that can be easily done and approved and merged?
> >
> > First, you certainly don't need it in every mbuf - just the first in the
> > chain, with struct pkthdr (where mtags also start).
>
> True.
>
> > Second.
> > The "last hook ptr" does not look like a general solution for all cases
> > and occupies 8 bytes. What about an idea from networking itself - TTL?
> > It occupies fewer bytes; the main problem is to decide where to
> > decrement (e.g. each netgraph hook, etc.)
>
> The loop prevention I'm talking about is not so much about the packet
> looping through the network stack, but rather the packet looping through
> pfil hooks. Consider these 2 scenarios:
>
> 1. Dummynet reinjection, this is how it works in the current pf:
>   a) A packet enters via ip6_input()
>   b) pfil_mbuf_in() sends it to pf_check6_in() which then sends it to
>      pf_test()
>   c) pf sends it to dummynet configured for a delay pipe
>   d) dummynet consumes the packet, pfil_mbuf_in()'s loop is interrupted
>   e) later dummynet re-injects the packet using netisr_dispatch()
>   f) the packet goes through ip6_input() and pfil_mbuf_in() again
>   g) pf_check6_in()/pf_test() perform their own logic to determine that
>      the packet has already gone through pf_test()
>   h) the packet continues through pfil_mbuf_in() and finally goes through
>      ip6_(try)?forward and so on
>
> In this case we could benefit from marking the packet/mbuf that it has
> already gone through pf_check6_in() in pfil_mbuf_in()'s loop. When the
> loop is run again, all pfil hooks before and including pf_check6_in()
> can be skipped.

...unless some other pfil consumer decides that it *wants* to check that
packet with some preliminary checks of its own. Or there is more than one
filter and somebody wants, e.g.,
to start in pf and return from dummynet into ipfw...

> 2. af-to Address Family translation, the algorithm below is for the
> experimental pf code I've mentioned:
>   a) A packet enters via ip6_input()
>   b) pfil_mbuf_in() sends it to pf_check6_in() which then sends it to
>      pf_test()
>   c) pf_test() translates the packet from IPv6 to IPv4
>   d) pf marks the packet as if it had gone through pf_check_in() even
>      though it has really gone through pf_check6_in()
>   e) the translated packet is sent through dummynet, as in the previous
>      scenario
>   f) dummynet reinjects the packet using netisr_dispatch()
>   g) the packet goes through pfil_mbuf_in() and pf_check_in() is skipped

Again, this is specific to pf/dummynet, not to the stack in general. But
general mbuf fields should not be tied to specific subsystems - tags are
more appropriate here (unless you can fit into a bit or two, like
M_PROTO*). For general pkthdr fields, I'd rather think about more general
usage (to include e.g. netgraph, yes) of TTL, refcounts...

> > Third.
> > What about redoing mtag allocator so that it reuses m_pktdat[] when M_EXT
> > is set? This could optimize performance for many tags, not just yours.
>
> I'm not sure I understand this idea. Storing mtags directly in m_pktdat?

Yes, though not quite: m_pktdat is a member of a union with m_ext, and tags
could be placed there only if m_ext *is* valid (otherwise m_pktdat is in
use for packet data), that is, at m_pktdat[sizeof(m_ext) aligned to 8].
Also, many functions would need to be taught to handle this, e.g. to
reallocate tags out of that area if the first mbuf collapses from m_ext
back to using m_pktdat. And tag allocation should be measured to determine
whether such changes are really worth it performance-wise.

-- 
WBR, @nuclight
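
P.S. To make the m_pktdat idea above concrete, here is a minimal userland
sketch of the offset arithmetic only. Everything prefixed toy_/TOY_ is
hypothetical: the struct layouts and the 168-byte data area are stand-ins
made up for the example, not the real definitions from sys/mbuf.h, and
kernel code would test M_EXT in m_flags rather than a has_ext field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the in-mbuf data area; the real value differs. */
#define TOY_MHLEN	168
/* Round x up to the next multiple of 8. */
#define TOY_ROUNDUP8(x)	(((size_t)(x) + 7) & ~(size_t)7)

/* Hypothetical stand-in for struct m_ext (external storage descriptor). */
struct toy_ext {
	void		*ext_buf;
	void		(*ext_free)(void *);
	void		*ext_arg1;
	void		*ext_arg2;
	uint32_t	 ext_size;
	uint32_t	 ext_flags;
};

/* Hypothetical stand-in for the first mbuf of a chain. */
struct toy_mbuf {
	int	has_ext;		/* models (m_flags & M_EXT) */
	union {
		struct toy_ext	m_ext;			/* valid if has_ext */
		char		m_pktdat[TOY_MHLEN];	/* data otherwise */
	};
};

/* Offset inside m_pktdat of the first byte not overlaid by m_ext. */
static size_t
toy_tag_area_offset(void)
{
	return (TOY_ROUNDUP8(sizeof(struct toy_ext)));
}

/*
 * Return the spare area behind m_ext and store its length in *lenp.
 * Returns NULL (length 0) when m_pktdat holds packet data.  The offset is
 * 8-aligned relative to m_pktdat; absolute alignment also depends on how
 * the mbuf itself is aligned.
 */
static char *
toy_tag_area(struct toy_mbuf *m, size_t *lenp)
{
	size_t off;

	assert(m != NULL && lenp != NULL);
	if (!m->has_ext) {
		*lenp = 0;
		return (NULL);
	}
	off = toy_tag_area_offset();
	*lenp = sizeof(m->m_pktdat) - off;
	return (m->m_pktdat + off);
}
```

With has_ext set, toy_tag_area() hands back whatever is left of m_pktdat
past the m_ext overlay; with has_ext clear it returns NULL, which is the
case where tags would have to be allocated elsewhere (or moved out, if the
mbuf collapses back to internal storage).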