Date: Wed, 6 Aug 2025 13:28:12 +0300
From: Vadim Goncharov <vadimnuclight@gmail.com>
To: Kajetan Staszkiewicz <vegeta@tuxpowered.net>
Cc: net@FreeBSD.org
Subject: Re: adding fields to struct mbuf
Message-ID: <20250806132812.5d558652@nuclight.lan>
In-Reply-To: <41726ad6-9620-4902-bd66-5f1a0606becd@tuxpowered.net>
References: <e315d113-2a8a-4e17-8299-0e892a0b0ef3@tuxpowered.net>
	<20250731155550.529ce1fb@nuclight.lan>
	<41726ad6-9620-4902-bd66-5f1a0606becd@tuxpowered.net>
X-Mailer: Claws Mail 3.21.0 (GTK+ 2.24.33; amd64-portbld-freebsd13.4)
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-net

On Tue, 5 Aug 2025 12:35:58 +0200
Kajetan Staszkiewicz <vegeta@tuxpowered.net> wrote:

> On 2025-07-31 14:55, Vadim Goncharov wrote:
> > On Thu, 31 Jul 2025 13:03:22 +0200
> > Kajetan Staszkiewicz <vegeta@tuxpowered.net> wrote:
> >   
> >> Hello group,
> >>
> >> I'm researching loop prevention in pfil. There are cases where packets
> >> are reinjected into the network stack and would be handled by the same
> >> hooks again, i.e. pf + dummynet where currently pf itself handles loop
> >> prevention on its own. My current experiment's approach to making loop
> >> prevention a general, non-pf-specific thing is to create a new mtag with
> >> pointer to the last hook and update it in pfil.c/pfil_mbuf_common().
> >> That works well so far, but it means memory allocation when pfil hooks
> >> are involved. I'm unsure what the impact on performance would be.
> >> Another approach would be to extend struct mbuf, or probably rather
> >> struct m_pkthdr, to contain the aforementioned pointer. But is changing
> >> that struct something that can be easily done and approved and merged?  
> > 
> > First, you certainly don't need it in every mbuf - just first in chain with
> > struct pkthdr (where mtags also start).  
> 
> True.
> 
> > Second.
> > The "last hook ptr" does not look like general solution for all cases and
> > occupies 8 bytes. What about idea from network itself - TTL ? It occupies
> > less bytes, the main problem is to decide where to decrement (e.g. each
> > netgraph hook, etc.)  
> 
> The loop prevention I'm talking about is not as much about the packet
> looping through the network stack, but rather packet looping through
> pfil hooks. Consider those 2 scenarios:
> 
> 1. Dummynet reinjection, this is how it works in the current pf:
> a) A packet enters via ip6_input()
> b) pfil_mbuf_in() sends it to pf_check6_in() which then sends it to
> pf_test()
> c) pf sends it to dummynet configured for a delay pipe
> d) dummynet consumes the packet, pfil_mbuf_in()'s loop is interrupted
> e) later dummynet re-injects the packet using netisr_dispatch()
> f) the packet goes through ip6_input() and pfil_mbuf_in() again
> g) pf_check6_in()/pf_test() perform their own logic to determine that
> the packet has already gone through pf_test()
> h) the packet continues through pfil_mbuf_in() and finally goes through
> ip6_(try)?forward and so on
> 
> In this case we could benefit from marking the packet/mbuf as having
> already gone through pf_check6_in() in pfil_mbuf_in()'s loop. When the
> loop is run again, all pfil hooks before and including pf_check6_in()
> can be skipped.

...unless some other pfil consumer decides that it *wants* to run its own
preliminary checks on that packet. Or there is more than one filter, and
somebody wants, e.g., to start in pf and return from dummynet into ipfw...
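
To make that objection concrete, here is a minimal userland sketch of the
"resume after the last hook" idea. It is not kernel code: struct pkt, the
hook table and the hook names ("precheck", "pf", "other") are all
hypothetical, and last_hook only stands in for the proposed mtag/pkthdr
field.

```c
#include <assert.h>
#include <stddef.h>

#define NHOOKS 3

enum verdict { PASS, CONSUMED };

/* Hypothetical packet model; last_hook stands in for the proposed field. */
struct pkt {
	int last_hook;		/* index of the last hook run, -1 if none */
	int seen[NHOOKS];	/* how many times each hook saw the packet */
};

/* Hypothetical hook descriptor; real pfil hooks are function pointers. */
struct hook {
	const char *name;
	int consumes;		/* 1: hook takes the packet (e.g. queues it) */
};

static const struct hook chain[NHOOKS] = {
	{ "precheck", 0 },
	{ "pf",       1 },	/* models pf handing the packet to dummynet */
	{ "other",    0 },
};

static void
pkt_init(struct pkt *p)
{
	p->last_hook = -1;
	for (int i = 0; i < NHOOKS; i++)
		p->seen[i] = 0;
}

/* Walk the chain, skipping every hook up to and including last_hook. */
static enum verdict
run_hooks(struct pkt *p)
{
	for (int i = p->last_hook + 1; i < NHOOKS; i++) {
		p->seen[i]++;
		p->last_hook = i;
		if (chain[i].consumes)
			return (CONSUMED);	/* reinjected later */
	}
	return (PASS);
}
```

Calling run_hooks() twice on the same packet models the reinjection: the
second pass starts at "other", so "precheck" never sees the reinjected
packet. That is exactly the case where a consumer that wants a second look
loses it.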

> 2. af-to Address Family translation, the algorithm below is for the
> experimental pf code I've mentioned:
> a) A packet enters via ip6_input()
> b) pfil_mbuf_in() sends it to pf_check6_in() which then sends it to
> pf_test()
> c) pf_test() translates the packet from IPv6 to IPv4
> d) pf marks the packet as if it had gone through pf_check_in() even
> though it really went through pf_check6_in()
> e) the translated packet is sent through dummynet, as in the previous
> scenario
> f) dummynet reinjects the packet using netisr_dispatch()
> g) the packet goes through pfil_mbuf_in() and pf_check_in() is skipped

Again, this is specific to pf/dummynet, not to the stack in general. General
mbuf fields should not be tied to specific subsystems - tags are more
appropriate here (unless you can fit into a bit or two, like M_PROTO*).

For general pkthdr fields, I would rather think of something with more
general usage (including e.g. netgraph, yes), such as a TTL, refcounts...
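
As a rough illustration of the TTL idea, a one-byte counter could be
decremented at every reinjection point. This is only a userland sketch: the
field name reinject_ttl, the struct pkt_hdr model and the default budget of 8
are assumptions made for the example, not anything that exists in struct
m_pkthdr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REINJECT_TTL_DEFAULT 8	/* assumed budget, not a FreeBSD constant */

/* Hypothetical stand-in for the part of the packet header holding the TTL. */
struct pkt_hdr {
	uint8_t reinject_ttl;
};

/* One byte instead of an 8-byte "last hook" pointer. */
static_assert(sizeof(struct pkt_hdr) == 1, "TTL model should be one byte");

static void
pkt_hdr_init(struct pkt_hdr *h)
{
	h->reinject_ttl = REINJECT_TTL_DEFAULT;
}

/*
 * To be called wherever a packet is handed back to the stack.
 * Returns true if the packet may go around once more, false if the
 * budget is exhausted and the caller should drop it.
 */
static bool
pkt_reinject_allowed(struct pkt_hdr *h)
{
	if (h->reinject_ttl == 0)
		return (false);
	h->reinject_ttl--;
	return (true);
}
```

The open question stays the same as with the network TTL: which places count
as a "hop" (each netgraph hook, each netisr redispatch, and so on).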

> > Third.
> > What about redoing mtag allocator so that it reuses m_pktdat[] when M_EXT
> > is set? This could optimize performance for many tags, not just yours.  
> 
> I'm not sure I understand this idea. Storing mtags directly in m_pktdat?

Yes, though not quite: m_pktdat is a member of a union with m_ext, so tags
could be placed there only if m_ext *is* valid (otherwise m_pktdat holds
packet data), that is, at m_pktdat[sizeof(m_ext) aligned to 8]. Also, many
functions would need to be taught to handle this, e.g. to reallocate tags out
of that area if the first mbuf collapses from m_ext to using m_pktdat.
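
The offset arithmetic can be sketched in userland as follows. The layout here
is a simplified model, not the real struct mbuf: struct model_ext, union
model_dat, MODEL_MHLEN (an arbitrary placeholder size) and tag_area() are all
made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MHLEN 200		/* placeholder, not the kernel's MHLEN */

/* Simplified stand-in for struct m_ext. */
struct model_ext {
	void		*ext_buf;
	void		(*ext_free)(void *);
	uint32_t	 ext_size;
	uint32_t	 ext_flags;
};

/* External-storage descriptor and inline data share the same bytes. */
union model_dat {
	struct model_ext ext;
	char		 pktdat[MODEL_MHLEN];
};

/* Round n up to the next multiple of 8. */
#define ALIGN8(n)	(((n) + 7u) & ~(size_t)7u)

/* First byte past the descriptor, 8-byte aligned, and what is left. */
#define TAG_AREA_OFF	ALIGN8(sizeof(struct model_ext))
#define TAG_AREA_LEN	(MODEL_MHLEN - TAG_AREA_OFF)

/*
 * Return the start of the region usable for tags, or NULL when the
 * buffer has no external storage and pktdat carries packet data.
 */
static void *
tag_area(union model_dat *d, int has_ext)
{
	if (!has_ext)
		return (NULL);
	return (d->pktdat + TAG_AREA_OFF);
}
```

The NULL branch is where the extra work hides: when an mbuf stops using
external storage, any tags living in that region must be moved out before
the bytes are reused for data.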

And the cost of tag allocation should be measured first, to determine
whether such changes are really worth the performance gain.


-- 
WBR, @nuclight