Date: Thu, 18 Jul 2024 21:23:02 +0200
From: tuexen@freebsd.org
To: Alan Somers <asomers@freebsd.org>
Cc: Junho Choi <junho.choi@gmail.com>, FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: TCP Success Story (was Re: TCP_RACK, TCP_BBR, and firewalls)
Message-ID: <400A46A2-E75F-4BE3-BFFF-340CF4557322@freebsd.org>
In-Reply-To: <CAOtMX2i5-7=qvPyb-tbJjkKwSKv6mawxZ-jeHG9UaPi2AY6CRg@mail.gmail.com>
References: <CAOtMX2iLv5OW4jQiBOHqMvcqkQSznTyO-eWMrOcHWbpeyaeRsg@mail.gmail.com> <C7467BCD-7232-4C6C-873E-EEC2482214A7@freebsd.org> <CAOtMX2hGYfm0U0L25-vHSX0iOyKCbZydaAzye6Y6U59mQeF7rA@mail.gmail.com> <B86DCBA6-542F-4951-A726-3A66D3D640D6@freebsd.org> <CAJ5e%2BHAvNbazCkd_G_E=QojqknQe23khCimyKWk=TTyzHr2j0Q@mail.gmail.com> <B2A62C1B-9BD4-4F82-A296-07A3B41CA402@freebsd.org> <CAOtMX2i5-7=qvPyb-tbJjkKwSKv6mawxZ-jeHG9UaPi2AY6CRg@mail.gmail.com>

> On 18. Jul 2024, at 20:37, Alan Somers <asomers@freebsd.org> wrote:
>
> Coexist how? Do you mean that one socket can use one and a different
> socket uses the other? That makes sense.
Correct.

Best regards
Michael
>
> On Thu, Jul 18, 2024 at 10:34 AM <tuexen@freebsd.org> wrote:
>>
>>> On 18. Jul 2024, at 15:00, Junho Choi <junho.choi@gmail.com> wrote:
>>>
>>> Alan - this is a great result to see. Thanks for experimenting.
>>>
>>> Just curious why bbr and rack don't co-exist? Those are two separate things.
>>> Is it a current bug or by design?
>> Technically RACK and BBR can coexist. The problem was with pf and/or LRO.
>>
>> But this is all fixed now in 14.1 and head.
>>
>> Best regards
>> Michael
>>>
>>> BR,
>>>
>>> On Thu, Jul 18, 2024 at 5:27 AM <tuexen@freebsd.org> wrote:
>>>> On 17. Jul 2024, at 22:00, Alan Somers <asomers@freebsd.org> wrote:
>>>>
>>>> On Sat, Jul 13, 2024 at 1:50 AM <tuexen@freebsd.org> wrote:
>>>>>
>>>>>> On 13. Jul 2024, at 01:43, Alan Somers <asomers@FreeBSD.org> wrote:
>>>>>>
>>>>>> I've been experimenting with RACK and BBR. In my environment, they
>>>>>> can dramatically improve single-stream TCP performance, which is
>>>>>> awesome. But pf interferes. I have to disable pf in order for them
>>>>>> to work at all.
>>>>>>
>>>>>> Is this a known limitation? If not, I will experiment some more to
>>>>>> determine exactly what aspect of my pf configuration is responsible.
>>>>>> If so, can anybody suggest what changes would have to happen to make
>>>>>> the two compatible?
>>>>> A problem with same symptoms was already reported and fixed in
>>>>> https://reviews.freebsd.org/D43769
>>>>>
>>>>> Which version are you using?
>>>>>
>>>>> Best regards
>>>>> Michael
>>>>>>
>>>>>> -Alan
>>>>
>>>> TLDR; tcp_rack is good, cc_chd is better, and tcp_bbr is best
>>>>
>>>> I want to follow up with the list to post my conclusions. Firstly
>>>> tuexen@ helped me solve my problem: in FreeBSD 14.0 there is a 3-way
>>>> incompatibility between (tcp_bbr || tcp_rack) && lro && pf. I can
>>>> confirm that tcp_bbr works for me if I either disable LRO, disable PF,
>>>> or switch to a 14.1 server.
>>>>
>>>> Here's the real problem: on multiple production servers, downloading
>>>> large files (or ZFS send/recv streams) was slow. After ruling out
>>>> many possible causes, wireshark revealed that the connection was
>>>> suffering about 0.05% packet loss. I don't know the source of that
>>>> packet loss, but I don't believe it to be congestion-related. Along
>>>> with a 54ms RTT, that's a fatal combination for the throughput of
>>>> loss-based congestion control algorithms. According to the Mathis
>>>> Formula [1], I could only expect 1.1 MBps over such a connection.
>>>> That's actually worse than what I saw. With default settings
>>>> (cc_cubic), I averaged 5.6 MBps. Probably Mathis's assumptions are
>>>> outdated, but that's still pretty close for such a simple formula
>>>> that's 27 years old.
>>>>
>>>> So I benchmarked all available congestion control algorithms for
>>>> single download streams. The results are summarized in the table
>>>> below.
>>>>
>>>> Algo      Packet Loss Rate    Average Throughput
>>>> vegas     0.05%               2.0 MBps
>>>> newreno   0.05%               3.2 MBps
>>>> cubic     0.05%               5.6 MBps
>>>> hd        0.05%               8.6 MBps
>>>> cdg       0.05%               13.5 MBps
>>>> rack      0.04%               14 MBps
>>>> htcp      0.05%               15 MBps
>>>> dctcp     0.05%               15 MBps
>>>> chd       0.05%               17.3 MBps
>>>> bbr       0.05%               29.2 MBps
>>>> cubic     10%                 159 kBps
>>>> chd       10%                 208 kBps
>>>> bbr       10%                 5.7 MBps
>>>>
>>>> RACK seemed to achieve about the same maximum bandwidth as BBR, though
>>>> it took a lot longer to get there. Also, with RACK, wireshark
>>>> reported about 10x as many retransmissions as dropped packets, which
>>>> is suspicious.
>>>>
>>>> At one point, something went haywire and packet loss briefly spiked to
>>>> the neighborhood of 10%. I took advantage of the chaos to repeat my
>>>> measurements. As the table shows, all algorithms sucked under those
>>>> conditions, but BBR sucked impressively less than the others.
>>>>
>>>> Disclaimer: there was significant run-to-run variation; the presented
>>>> results are averages. And I did not attempt to measure packet loss
>>>> exactly for most runs; 0.05% is merely an average of a few selected
>>>> runs. These measurements were taken on a production server running a
>>>> real workload, which introduces noise. Soon I hope to have the
>>>> opportunity to repeat the experiment on an idle server in the same
>>>> environment.
>>>>
>>>> In conclusion, while we'd like to use BBR, we really can't until we
>>>> upgrade to 14.1, which hopefully will be soon. So in the meantime
>>>> we've switched all relevant servers from cubic to chd, and we'll
>>>> reevaluate BBR after the upgrade.
>>> Hi Alan,
>>>
>>> just to be clear: the version of BBR currently implemented is
>>> BBR version 1, which is known to be unfair in certain scenarios.
>>> Google is still working on BBR to address this problem and improve
>>> it in other aspects. But there is no RFC yet and the updates haven't
>>> been implemented yet in FreeBSD.
>>>
>>> Best regards
>>> Michael
>>>>
>>>> [1]: https://www.slac.stanford.edu/comp/net/wan-mon/thru-vs-loss.html
>>>>
>>>> -Alan
>>>
>>>
>>>
>>>
>>> --
>>> Junho Choi <junho dot choi at gmail.com> | https://saturnsoft.net
>>
>
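
For readers who want to redo the Mathis estimate quoted in the thread, here is
a minimal sketch in Python. It uses the simplified form of the bound,
rate <= C * MSS / (RTT * sqrt(p)), with the constant C taken as 1. The RTT
(54 ms) and loss rates (0.05% and 10%) come from the thread. The MSS of 1460
bytes and the helper name mathis_throughput are assumptions made for
illustration, since the thread does not say which inputs were actually used.

```python
import math


def mathis_throughput(mss_bytes, rtt_s, loss_rate, c=1.0):
    """Estimate the loss-limited TCP throughput bound in bytes per second.

    Simplified Mathis et al. model: rate <= c * MSS / (RTT * sqrt(p)).
    The default c=1.0 is an assumption; other values appear in the
    literature depending on the ACK strategy and loss model.
    """
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in the range (0, 1]")
    if rtt_s <= 0.0 or mss_bytes <= 0:
        raise ValueError("rtt_s and mss_bytes must be positive")
    return c * mss_bytes / (rtt_s * math.sqrt(loss_rate))


# Assumed MSS: 1460 bytes (typical for Ethernet without TCP options).
MSS_BYTES = 1460
# From the thread: 54 ms round-trip time.
RTT_S = 0.054

# The two loss rates reported in the thread: 0.05% and roughly 10%.
for loss in (0.0005, 0.10):
    rate = mathis_throughput(MSS_BYTES, RTT_S, loss)
    print(f"loss {loss:.2%}: about {rate / 1e6:.2f} MB/s")
```

With these assumed inputs the 0.05% case comes out a little above 1 MB/s,
which is in line with the 1.1 MBps figure quoted above. The exact value
shifts with the MSS and the constant chosen.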