Date: Fri, 05 Feb 2021 09:11:45 +0100
From: GomoR <freebsd-stable@gomor.org>
To: John Baldwin <jhb@freebsd.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: Suspected mbuf leak with Nginx + sendfile + TLS in 12.2-STABLE
Message-ID: <069535216479ce00859e4bcbf499f8a1@gomor.org>
In-Reply-To: <9c56bfda-725c-9c2a-9db3-4599abcfeaa0@FreeBSD.org>
References: <f6118f40fcac0e938e4050fc36a1e05e@gomor.org> <9c56bfda-725c-9c2a-9db3-4599abcfeaa0@FreeBSD.org>

On 2021-02-04 19:33, John Baldwin wrote:
> None of the sendfile or KTLS changes from Netflix are in 12, they are only
> in 13 and later.

I thought about that possibility, thank you for the clarification.

>> Don't transmit mbufs that aren't yet ready on TOE sockets.
>> This includes mbufs waiting for data from sendfile() I/O requests, or
>> mbufs awaiting encryption for KTLS.
>> https://github.com/freebsd/freebsd-src/commit/14c77f30b201bf76119d59678e72051c093333c2
>
> This patch only applies to Chelsio T5/T6 NICs when using TOE (TCP offload)
> and doesn't affect freeing mbufs, it just fixes a race when the NIC could
> potentially send random garbage if it sends the mbuf before the scheduled
> disk I/O to populate it with data from disk has completed.

Understood.

>> NIC is:
>> ix0: <Intel(R) PRO/10GbE PCI-Express Network Driver>
>>
>> What can we do to help you find the root cause?
>
> The first step I would do if possible would be to bisect between the last
> known working version and the version that is known to be broken to
> determine which commit introduced the problem. One thing that could help
> here is to see if you can reproduce the problem using a 12.2 kernel on a
> 12.1 world + ports. If you can, then you can limit your bisecting to just
> building new kernels which will make that process quicker.

Thank you for the tip, I'll try that path and let you know.

> You might also see if using a different NIC shows the same problem. If
> not, then it might point to a regression in the NIC driver (or perhaps in
> iflib as ix uses iflib I believe).

Unfortunately, that is not a possibility here.

I did some other tests and found where the problem arises. We use the
proxy_pass directive within Nginx, so the network flow comes in through one
public interface (ix0) and is proxied through a second one (ix1) towards a
remote machine. Changing the Nginx configuration so that traffic only goes
through ix0 does not cause the issue.

So the problem seems to be related to passing packets between 2 NICs.

I'll keep you posted.

Regards,