Date:      Wed, 31 Aug 2022 01:42:16 +0000
From:      tt78347 <tt78347@protonmail.com>
To:        Lutz Donnerhacke <lutz@donnerhacke.de>
Cc:        "freebsd-net@freebsd.org" <freebsd-net@freebsd.org>
Subject:   Re: IPFW NAT intermittently fails to redirect UDP packets; seeking DTrace scripts or other advice
Message-ID:  <HhQgyIwbcYuEmalaD76MpikeMKmy7_yGN0V3iFPeY2g9TlDmttL9pDKMNRd6ScLpfkoydEj0mHDLsIJQfKUoJF9OwdYwap6Ui5v7MtwdJjg=@protonmail.com>
In-Reply-To: <20220830161145.GA31694@belenus.iks-jena.de>
References:  <gg17I_Npe7ROH1jMb1q1NImxP-WeYJ1Onu-QT6OKzybIsUP1GLxQyhTqHXO6rqTSJlI9t776Kb_cfCdps8xH5aaSWxTerm8MCaG2qb0i770=@protonmail.com> <20220830161145.GA31694@belenus.iks-jena.de>




> Only a quick look ...
>
> There is no guarantee, that the ports of the UDP packets are not
> modified by libalias (NAT is designed to do exactly this modification).
> So some of the matches seem to be a bit optimistic.
>
> > - This system has net.inet.ip.fw.one_pass=0
>
> man ipfw
> To let the packet continue after being (de)aliased, set the sysctl
> variable net.inet.ip.fw.one_pass to 0. For more information about
> aliasing modes, refer to libalias(3).
>
> Hence the NAT is applied multiple times if the path through the rules
> is a bit unlucky.
>
Thank you for your response.


Thanks for bringing up this point about ports. I had not thought about it.
However, I'm not sure exactly what you mean here. redirect_port should
not change the destination port of incoming packets, and if I am not
mistaken, rule 452 should allow all relevant incoming packets through
(after they have been processed by NAT). Unless I have made a foolish
error, rules 450-452 specify destination ports.
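
For context, here is roughly the shape of the rules I am describing (a
rough sketch only; the interface name em0, the inside address
192.0.2.10, and the exact rule bodies are placeholders rather than my
real configuration):

  # in-kernel NAT instance with static redirections for IKE/NAT-T
  ipfw nat 1 config if em0 \
      redirect_port udp 192.0.2.10:500 500 \
      redirect_port udp 192.0.2.10:4500 4500

  # 450-451: run inbound UDP 500/4500 arriving on the external
  # interface through the NAT instance
  ipfw add 450 nat 1 udp from any to any 500 in recv em0
  ipfw add 451 nat 1 udp from any to any 4500 in recv em0

  # 452: with one_pass=0 the packet re-enters the ruleset after being
  # de-aliased, so pass it onward by destination port
  ipfw add 452 allow udp from any to 192.0.2.10 500,4500 via em0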

On the other hand, since we are forwarding, it's true that incoming and
outgoing packets are evaluated by the firewall twice in this case: once
at the external interface and once at bridge0 (or on the epair, I'm not
sure which; I think I've seen both). I don't see how that could be
causing an issue, since even when the packet is at the bridge, it should
still match "via $extif", because "recv $extif" is still true. So it
would still match 450-452.


Though, I can't rule out that I have a major misunderstanding about how
IPFW works; it has happened before. In fact, as I do some further
experimenting, I'm starting to doubt whether what I said above is
correct.


>
> The traces show, that the problematic cases are those where the
> packets are not (de)aliased. This can be the case, when libalias has
> no more free ports available for aliasing. In such a case, the packet
> is returned unmodified (unaliased) with an error code. I'm not sure,
> if this will cause a packet drop or not, especially in the one_pass=0
> case.
>
> It might be possible, that duplicate packets (or quickly repeated
> ones) trigger an unintended aliasing of the source port. This will
> create a flow in the NAT table which is handled before the port
> redirection. And it might miss the rules with explicit port numbers.
>
> But this will be probably the wrong idea.


I am intrigued by this idea of unintended creation of NAT flows. It's
not something I am an expert in by any means. However, I do not think
source ports are changing here under any circumstances, because I have
never witnessed a packet trace with any ports aside from 500 and 4500.

But what you have said about NAT flows being inadvertently created is
still interesting, and I had not thought about it. It sounds like it
could be a factor. I will experiment further. Is there a good way to
examine the contents of this table?
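
In the meantime, one thing I may try on the DTrace front (a rough
sketch only: it assumes the in-kernel libalias used by "ipfw nat" is in
play, that LibAliasIn/LibAliasOut are visible to fbt rather than
inlined, and that PKT_ALIAS_OK is 1 as in alias.h; it would not see a
userspace natd process):

  # print every libalias verdict that is not PKT_ALIAS_OK
  dtrace -n '
  fbt::LibAliasIn:return,
  fbt::LibAliasOut:return
  /(int)arg1 != 1/
  {
      printf("%s returned %d", probefunc, (int)arg1);
  }'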



I will also mention that while this overall setup was working properly
prior to my upgrade to 12.3, I did not have rules 450-452 specified
explicitly as I do here. I had placed them here early on in an attempt
to fix the issue.
Prior to the upgrade, all NAT was handled in 500-540.




More information / report on today's observations:
I'm not sure if any of this information is useful, but here it is in
case it provides any clues.




This issue has actually been happening more frequently now that I've
started to experiment with it more, and also after moving some traffic
off of this host. It happened again today, and I was actually able to
start natd (previously I had an error, but I've now invoked it in the
foreground using -v). Specifying a divert rule for this natd instance
on rule 445 fixed the issue, but only for about 20 minutes. As I was
experimenting to try to see if my rules were wrong, it started to work
again, apparently not due to any experimental changes I had made, since
after eliminating these changes, it still continued to work as expected.



I also witnessed something quite extraordinary and, to me, inexplicable.
So far, I've been talking about a specific host that has been having
this problem. I've referred to its IP address as 1.1.1.1, and for my
packet traces, I've been making reference to an external host whose
packets often have this issue on 1.1.1.1, calling that external host
2.2.2.2; this is the external host whose packets are in the packet
traces I posted.

Just now, as I was doing some experimentation on 1.1.1.1 as mentioned
above, the same issue was produced on 2.2.2.2 (with other hosts on my
network), even though 2.2.2.2 has not experienced this issue for 6
months or more. Incidentally, 2.2.2.2 sends and receives far, far more
UDP on 500,4500 than 1.1.1.1. The only distinguishable thing that I did
on 2.2.2.2 before the issue occurred was to try to initiate an outgoing
IKE connection repeatedly to 1.1.1.1, as I was experimenting on
1.1.1.1. But I can't imagine that this is the first time that I've done
that.


