Date:      Thu, 8 Dec 2022 08:57:08 +0200
From:      John Hay <john@sanren.ac.za>
To:        ipfw@freebsd.org
Subject:   Re: ipfw nat and smaller wan mtu
Message-ID:  <CAGv8uarEWoV=C-xMvZzq5m-eCxuNa%2BVFZSyGHQXYyyHrz6xSkg@mail.gmail.com>
In-Reply-To: <CAGv8uap=f5_63b-F8AZsaP0ZW9GDuF5p56yojcY2%2BVSB9=x6gw@mail.gmail.com>
References:  <CAGv8uap=f5_63b-F8AZsaP0ZW9GDuF5p56yojcY2%2BVSB9=x6gw@mail.gmail.com>

Hi,

Adding this patch does make it work for me. There might be better ways to
do it. I have tested with ping and ssh. In ping's case, ping reported:
frag needed and DF set (MTU 1392)

In ssh's case I could see with tcpdump that the "need to frag (mtu 1392)"
was sent back and the next packet's length was adjusted.

#####
06:29:59.869677 IP (tos 0x48, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 1500)
    10.10.1.3.64344 > 10.10.7.7.22: Flags [.], cksum 0xb64d (correct), seq 39:1487, ack 39, win 1027, options [nop,nop,TS val 260430893 ecr 926374970], length 1448
06:29:59.869954 IP (tos 0x0, ttl 63, id 62454, offset 0, flags [none], proto ICMP (1), length 596)
    10.10.2.2 > 10.10.1.3: ICMP 10.10.7.7 unreachable - need to frag (mtu 1392), length 576
	IP (tos 0x48, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 1500, bad cksum e081 (->19b7)!)
    10.10.1.3.64344 > 10.10.7.7.22: Flags [.], seq 39:1487, ack 39, win 1027, options [nop,nop,TS val 260430893 ecr 926374970], length 1448
06:29:59.871301 IP (tos 0x48, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 1392)
    10.10.1.3.64344 > 10.10.7.7.22: Flags [.], cksum 0x6841 (correct), seq 39:1379, ack 39, win 1027, options [nop,nop,TS val 260430893 ecr 926374970], length 1340
#####
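
For reference, the TCP payload lengths in the capture follow directly from the two MTU values. A minimal sketch of that arithmetic in C, assuming a 20-byte IPv4 header without options and the 12 bytes of TCP options (nop,nop,timestamp) that tcpdump shows; the function name tcp_payload_for_mtu is made up for illustration:

```c
#include <assert.h>

/* Header sizes as seen in the capture; other traffic may differ. */
enum {
    IPV4_HEADER_LEN = 20,   /* IPv4 header without options (assumption) */
    TCP_HEADER_LEN = 20,    /* TCP header without options */
    TCP_TS_OPTION_LEN = 12  /* nop,nop,TS val/ecr as printed by tcpdump */
};

/*
 * Largest TCP payload that fits into one unfragmented IPv4 packet
 * of total length mtu, given the header sizes above.
 */
static int tcp_payload_for_mtu(int mtu)
{
    int overhead = IPV4_HEADER_LEN + TCP_HEADER_LEN + TCP_TS_OPTION_LEN;

    /* The sketch only makes sense if there is room for a payload. */
    assert(mtu > overhead);
    return mtu - overhead;
}
```

With these assumptions tcp_payload_for_mtu(1500) is 1448 and tcp_payload_for_mtu(1392) is 1340, which matches the first and the last segment in the capture.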

--- sys/netinet/libalias/alias.c.orig   2022-05-12 04:54:03.000000000 +0000
+++ sys/netinet/libalias/alias.c        2022-12-08 05:42:25.127980000 +0000
@@ -365,6 +365,19 @@
                lnk = NULL;

        if (lnk != NULL) {
+               /*
+                   If the packet was locally generated, it will have a
+                   loopback address as source, which will not be handled
+                   correctly. For now use the destination address as source
+                   address. The correct source address might be the
+                   interface address that the packet will be going out on.
+               */
+               if (IN_LOOPBACK(ntohl(pip->ip_src.s_addr)) &&
+                   !IN_LOOPBACK(ntohl(pip->ip_dst.s_addr))) {
+                       DifferentialChecksum(&pip->ip_sum,
+                           &pip->ip_dst, &pip->ip_src, 2);
+                       pip->ip_src = pip->ip_dst;
+               }
                if (ip->ip_p == IPPROTO_UDP || ip->ip_p == IPPROTO_TCP) {
                        int accumulate, accumulate2;
                        struct in_addr original_address;
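
The idea of the patch can also be tried outside the kernel. What follows is a user-space sketch, not the libalias code: it holds an IPv4 header as ten 16-bit words in host order, and the names is_loopback, full_checksum, incremental_checksum, check_header and rewrite_loopback_source are made up for illustration. incremental_checksum uses the RFC 1624 update rule, which I assume has the same effect as DifferentialChecksum() for the two 16-bit words of the address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-order test for 127.0.0.0/8, in the spirit of IN_LOOPBACK(). */
static int is_loopback(uint32_t addr)
{
    return (addr >> 24) == 127u;
}

/* Internet checksum over n 16-bit words (RFC 1071). */
static uint16_t full_checksum(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += words[i];
    while (sum >> 16)                   /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* RFC 1624 update: HC' = ~(~HC + ~m + m') for n replaced words. */
static uint16_t incremental_checksum(uint16_t cksum, const uint16_t *oldw,
    const uint16_t *neww, size_t n)
{
    uint32_t sum = (uint16_t)~cksum;

    for (size_t i = 0; i < n; i++) {
        sum += (uint16_t)~oldw[i];
        sum += neww[i];
    }
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* A header with a valid checksum sums to zero when checked as a whole. */
static void check_header(const uint16_t hdr[10])
{
    assert(full_checksum(hdr, 10) == 0);
}

/*
 * Same condition as in the patch: if the source is a loopback address
 * and the destination is not, copy the destination into the source and
 * fix up the header checksum. hdr[5] is the checksum, hdr[6..7] the
 * source and hdr[8..9] the destination. Returns 1 if it rewrote.
 */
static int rewrite_loopback_source(uint16_t hdr[10])
{
    uint32_t src = ((uint32_t)hdr[6] << 16) | hdr[7];
    uint32_t dst = ((uint32_t)hdr[8] << 16) | hdr[9];

    if (!is_loopback(src) || is_loopback(dst))
        return 0;
    /* Update the checksum first, while the old source is still there. */
    hdr[5] = incremental_checksum(hdr[5], &hdr[6], &hdr[8], 2);
    hdr[6] = hdr[8];
    hdr[7] = hdr[9];
    return 1;
}
```

The order inside rewrite_loopback_source matters in the same way as in the patch: the checksum update has to see the old source address, so it runs before the address is overwritten.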

On Wed, 7 Dec 2022 at 16:33, John Hay <john@sanren.ac.za> wrote:

> Hi,
>
> What would the proper ipfw rules be to make nat work and properly get the
> icmp too big packets back to a local host if the wan interface needs a
> smaller mtu?
>
> I'm using a FreeBSD machine as router/firewall, but its wan interface
> needs a smaller mtu (1392) than the default ethernet mtu. I have replicated
> this in a VM so I can test it. My simplified ipfw rules make it work for
> packets that are smaller than the wan mtu:
>
> #####
> net.inet.ip.fw.one_pass=0
> net.inet.ip.fw.verbose=1
> #####
> fwcmd="/sbin/ipfw -q"
> wan="vtnet0"
> lan="vtnet1"
> ${fwcmd} nat 123 config if ${wan} log
> ${fwcmd} add 1000 count log all from any to any
> ${fwcmd} add 5000 nat 123 ip4 from any to any via ${wan}
> ${fwcmd} add 6000 allow log all from any to any
> #####
> The wan ip of the firewall is 10.10.2.2 and the ip address of the host (on
> the lan side) I'm testing from is 10.10.1.3. And I did a ping to 10.10.5.5,
> which is on the other side of the wan interface.
>
> This works for packets smaller than the wan mtu. But if the packet is
> larger than the wan mtu, the icmp too big is generated, but with 127.0.0.1
> as the source and the wan ip as the destination and then sent via lo0 and
> it looks like this in the ipfw log:
>
> Dec  7 13:24:59 rtr kernel: ipfw: 1000 Count ICMP:3.4 127.0.0.1 10.10.2.2 out via lo0
>
> So I added a nat ipfw rule to catch that:
>
> ${fwcmd} add 5050 nat 123 ip4 from any to not 127.0.0.1 via lo0
>
> That helped partly because it was then able to recover the address of the
> host I was testing from and tried to send the packet out on the correct
> interface (vtnet1). Unfortunately it still had the source address of
> 127.0.0.1, which means it did not actually make it to the wire:
>
> ######
> Dec  7 14:17:31 rtr kernel: ipfw: 1000 Count ICMP:8.0 10.10.1.3 10.10.5.5 in via vtnet1
> Dec  7 14:17:31 rtr kernel: ipfw: 6000 Accept ICMP:8.0 10.10.1.3 10.10.5.5 in via vtnet1
> Dec  7 14:17:31 rtr kernel: ipfw: 1000 Count ICMP:8.0 10.10.1.3 10.10.5.5 out via vtnet0
> Dec  7 14:17:31 rtr kernel: ipfw: 6000 Accept ICMP:8.0 10.10.2.2 10.10.5.5 out via vtnet0
> Dec  7 14:17:31 rtr kernel: ipfw: 1000 Count ICMP:3.4 127.0.0.1 10.10.2.2 out via lo0
> Dec  7 14:17:31 rtr kernel: ipfw: 6000 Accept ICMP:3.4 127.0.0.1 10.10.2.2 out via lo0
> Dec  7 14:17:31 rtr kernel: ipfw: 1000 Count ICMP:3.4 127.0.0.1 10.10.2.2 in via lo0
> Dec  7 14:17:31 rtr kernel: ipfw: 6000 Accept ICMP:3.4 127.0.0.1 10.10.1.3 in via lo0
> Dec  7 14:17:31 rtr kernel: ipfw: 1000 Count ICMP:3.4 127.0.0.1 10.10.1.3 out via vtnet1
> Dec  7 14:17:31 rtr kernel: ipfw: 6000 Accept ICMP:3.4 127.0.0.1 10.10.1.3 out via vtnet1
> ######
>
> Once I have this sorted, there seems to be a similar problem with nptv6.
>
> Regards
>
> John
>
>
