Date:      Thu, 13 May 2021 01:12:47 +0000
From:      bugzilla-noreply@freebsd.org
To:        bugs@FreeBSD.org
Subject:   [Bug 255830] dummynet(4) queues/pipes do not work inside of a VNET jail
Message-ID:  <bug-255830-227@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=255830

            Bug ID: 255830
           Summary: dummynet(4) queues/pipes do not work inside of a VNET
                    jail
           Product: Base System
           Version: 13.0-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: kumba@gentoo.org

For the past few days I have been attempting to build a jail to
compartmentalize routing duties on my router appliance.  After configuring and
starting the jail, I ran into a curious situation: I was able to ping out to
the IPv4 internet, but unable to send DNS queries or receive DNS responses.
After much debugging, disabling the dummynet portions of my firewall script
solved the problem.

Some research on this led me to this tweet from 2011:
"""
Shawn Webb @lattera
setting any bandwidth limit in a #dummynet pipe effectively kills vnet jail
networking in #freebsd
"""
https://twitter.com/lattera/status/149222769014472704

Although that tweet is almost ten years old, it appears to be both accurate
and still applicable to modern-day FreeBSD.  As part of my attempt to defeat
bufferbloat on my cable modem, I set bandwidth limits in my firewall script
using the FQ_CoDel algorithm:

---
# Configure in-kernel NAT.  The NAT number is arbitrary.
${ipfw} nat 1 config if ${wan} deny_in same_ports unreg_only reset

# Configure the dummynet(4) subsystem to manage the available bandwidth and
# avoid a thing called "bufferbloat".  Two pipes are set up, one for
# downstream and the other for upstream.  Two schedulers using the FQ_CoDel
# algorithm are then attached to the pipes, with the downstream scheduler
# given some tuning.  Last, five queues are attached to each scheduler with
# weights from 55 to 100, and these queues are used further down in the
# firewall code.
${ipfw} pipe 1 config bw 294MBit/s burst 1048576       # Download pipe
${ipfw} pipe 2 config bw 12MBit/s                      # Upload pipe

${ipfw} sched 1 config pipe 1 type fq_codel target 5ms quantum 6000 flows 2048 \
    interval 300 limit 15360 ecn
${ipfw} sched 2 config pipe 2 type fq_codel ecn

${ipfw} queue 01 config sched 2 weight 100             # Outbound TCP ACK
${ipfw} queue 03 config sched 2 weight  90             # Outbound HTTP/HTTPS/RSYNC
${ipfw} queue 05 config sched 2 weight  85             # Outbound DNS
${ipfw} queue 07 config sched 2 weight  65             # Outbound Steam Client
${ipfw} queue 09 config sched 2 weight  55             # Outbound IMAP/POP3/SMTP

${ipfw} queue 02 config sched 1 weight 100             # Inbound TCP ACK
${ipfw} queue 04 config sched 1 weight  90             # Inbound HTTP/HTTPS/RSYNC
${ipfw} queue 06 config sched 1 weight  85             # Inbound DNS
${ipfw} queue 08 config sched 1 weight  65             # Inbound Steam Client
${ipfw} queue 10 config sched 1 weight  55             # Inbound IMAP/POP3/SMTP
---
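
For illustration only, a pair of queue rules of the kind referred to above
might look like the sketch below.  The rule numbers, the default value of
${wan}, and the choice of DNS as the example are assumptions and are not taken
from the original script; ${ipfw} defaults to a dry run that prints each
command rather than executing it.

```shell
#!/bin/sh
# Hypothetical sketch: rule numbers and the ${wan} default are illustrative
# and do not come from the original firewall script.
ipfw="${ipfw:-echo ipfw}"   # dry run by default: print commands only
wan="${wan:-em0}"           # assumed WAN interface name

dns_queue_rules() {
    # Outbound DNS queries are handed to queue 5 (upload scheduler 2).
    ${ipfw} add 5000 queue 5 udp from any to any 53 out xmit "${wan}"
    # Inbound DNS replies are handed to queue 6 (download scheduler 1).
    ${ipfw} add 5010 queue 6 udp from any 53 to any in recv "${wan}"
}

dns_queue_rules
```

Setting ipfw to the real binary (for example /sbin/ipfw) would install the
rules instead of printing them.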

To confirm this, I enabled queue rules only for HTTP and DNS, then ran
nslookup on another machine to resolve a well-known domain using Google's DNS
servers.  Nslookup usually tries to resolve about three times before giving up
with SERVFAIL, and looking at the output of `ipfw show', I could see that the
packet counter for queue #10 was at exactly '3' after nslookup timed out.
Removing the queue rules allows traffic to flow normally (I am actually
running on this particular firewall/routing setup as I write this).  ICMP
worked because I never configured a queue for it.  This sent me down quite a
few rabbit holes, as it is really difficult to debug dummynet via an ipfw
script.
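
One way to make this kind of counter check repeatable is to snapshot the
dummynet objects and the rule counters before and after a test query.  The
helper below is a minimal sketch, not part of the original script: the function
name dn_snapshot is hypothetical, and it assumes ${ipfw} names the ipfw binary
as in the firewall script.

```shell
#!/bin/sh
ipfw="${ipfw:-ipfw}"    # assumed: same variable convention as the script

# Hypothetical helper: print the state of every pipe, scheduler and queue,
# followed by the rule list with its packet/byte counters.
dn_snapshot() {
    for obj in pipe sched queue; do
        echo "== ${obj} =="
        ${ipfw} "${obj}" show
    done
    echo "== rules =="
    ${ipfw} show
}
```

Running `dn_snapshot > before.txt`, repeating the nslookup test, then running
`dn_snapshot > after.txt` and `diff -u before.txt after.txt` shows which
counters moved.  To run it from the host against a jail, ipfw could be set to
something like "jexec router ipfw", where "router" is a placeholder jail name.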

I have not tried fq_* algorithms other than CoDel, but given that it is fairly
new and the tweet is from 2011, I doubt the issue is specific to a single
algorithm; it more likely lies somewhere in dummynet itself.  It would be nice
to get some expert eyes on this and possibly even a fix.  Meanwhile, I will
have to find another way to make this setup work with both VNET jails and
dummynet.

-- 
You are receiving this mail because:
You are the assignee for the bug.


