Date: Fri, 05 Feb 2021 09:11:45 +0100
From: GomoR
To: John Baldwin
Cc: freebsd-stable@freebsd.org
Subject: Re: Suspected mbuf leak with Nginx + sendfile + TLS in 12.2-STABLE
In-Reply-To: <9c56bfda-725c-9c2a-9db3-4599abcfeaa0@FreeBSD.org>
Message-ID: <069535216479ce00859e4bcbf499f8a1@gomor.org>

On 2021-02-04 19:33, John Baldwin wrote:
> None of the sendfile or KTLS changes from Netflix are in 12, they are only
> in 13 and later.

I thought about that possibility, thank you for the clarification.

>> Don't transmit mbufs that aren't yet ready on TOE sockets.
>> This includes mbufs waiting for data from sendfile() I/O requests, or
>> mbufs awaiting encryption for KTLS.
>> https://github.com/freebsd/freebsd-src/commit/14c77f30b201bf76119d59678e72051c093333c2
>
> This patch only applies to Chelsio T5/T6 NICs when using TOE (TCP offload)
> and doesn't affect freeing mbufs, it just fixes a race when the NIC could
> potentially send random garbage if it sends the mbuf before the scheduled
> disk I/O to populate it with data from disk has completed.

Understood.

>> NIC is:
>> ix0:
>>
>> What can we do to help you find the root cause?
>
> The first step I would do if possible would be to bisect between the last
> known working version and the version that is known to be broken to
> determine which commit introduced the problem. One thing that could help
> here is to see if you can reproduce the problem using a 12.2 kernel on a
> 12.1 world + ports. If you can, then you can limit your bisecting to just
> building new kernels which will make that process quicker.

Thank you for the tip, I'll try that path and let you know.

> You might also see if using a different NIC shows the same problem. If
> not, then it might point to a regression in the NIC driver (or perhaps in
> iflib as ix uses iflib I believe).

Unfortunately, that is not a possibility here.

I did some other tests and found where the problem arises. We use the
proxy_pass directive in Nginx: the network flow comes in through one
public interface (ix0) and is proxied through a second one (ix1) towards
a remote machine. Changing the Nginx configuration so that traffic only
goes through ix0 does not cause the issue. So it seems to be related to
passing packets between the two NICs.

I'll keep you posted.

Regards,
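
For reference, a minimal sketch of the kind of setup described above. It
is illustrative only: the addresses, server name, certificate paths and
the proxy_bind line are hypothetical placeholders, not the production
configuration, and "sendfile on" is assumed from the subject of this
thread.

```nginx
# Hypothetical sketch; every address, name and path is a placeholder.
events {}

http {
    # Assumed from the thread subject (Nginx + sendfile + TLS).
    sendfile on;

    server {
        # Public side: this address is assumed to be configured on ix0.
        listen 192.0.2.10:443 ssl;
        server_name www.example.org;

        ssl_certificate     /usr/local/etc/nginx/example.crt;
        ssl_certificate_key /usr/local/etc/nginx/example.key;

        location / {
            # Remote backend, assumed to be routed via the second NIC (ix1).
            proxy_pass http://198.51.100.20:8080;

            # Optional: use the local address assumed to live on ix1 as the
            # source of the upstream connection. The outgoing interface
            # itself is still chosen by the routing table.
            proxy_bind 198.51.100.10;
        }
    }
}
```

In the variant that does not show the issue, the backend would be
reachable through ix0 as well, so that no traffic crosses ix1.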