From: James Gritton <jamie@freebsd.org>
To: jail@freebsd.org
Cc: bz@freebsd.org, "glebius@FreeBSD.org", Andrew Gallatin
Date: Tue, 13 Dec 2022 10:00:24 -0800
Subject: Re: prison_flag() check in hot path of in_pcblookup()
List-Id: Discussion about FreeBSD jail(8)
List-Archive: https://lists.freebsd.org/archives/freebsd-jail

On 2022-12-13 09:18, Andrew Gallatin wrote:

> I was trying to improve the performance of in_pcblookup(), as it is a
> very hot path for us (Netflix). One thing I noticed was the
> prison_flag() check in in_pcblookup_hash_locked() can cause a cache
> miss just by deref'ing the cred pointer, and it can also cause multiple
> misses in tables with collisions by causing us to walk the entire chain
> even after finding a perfect match.
>
> I'm curious why this check is needed. Can you explain it to me? It
> originated in this commit:
>
> commit 413628a7e3d23a897cd959638d325395e4c9691b
> Author: Bjoern A. Zeeb <bz@FreeBSD.org>
> Date:   Sat Nov 29 14:32:14 2008 +0000
>
>     MFp4:
>       Bring in updated jail support from bz_jail branch.
>
>     This enhances the current jail implementation to permit multiple
>     addresses per jail. In addtion to IPv4, IPv6 is supported as well.
>
> My thinking is that a jail will either use the host IP, and share its
> port space, or it will have its own IP entirely (but I know nothing
> about jails). In either case, a perfect 4-tuple match should be enough
> to uniquely identify the connection.
>
> Even if this somehow is not the case and we have multiple connections
> somehow sharing the same 4-tuple, how does checking the prison flag
> help us? It would prefer the jailed connection over the non jailed,
> but that would shadow a host connection. And if we had 2 jails sharing
> the same 4-tuple, the first jail would win.
>
> I can't see how this check is doing anything useful, so I'd very much
> like to remove this check if possible. Untested patch attached.

For a complete 4-tuple, it should indeed be the case that a match would
only ever identify a single prison. The later part of the function that
examines wildcards definitely needs the check. I don't get the XXX
comment about both being bound with SO_REUSEPORT, because I would only
expect that to apply to listening, not to full connections. But I also
expect Bjoern to know more than I do here...

- Jamie
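
To illustrate the distinction drawn in the reply above, here is a minimal
sketch in C. It is not the FreeBSD code: struct pcb, its fields, the
"jailed" flag (standing in for the prison_flag() test on the cred) and
lookup() are all hypothetical names made up for this example, and it
handles IPv4 only. The exact 4-tuple pass returns on the first hit without
looking at jail state, while the wildcard pass still prefers a jailed
listener over a host one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified PCB; NOT the FreeBSD struct inpcb. */
struct pcb {
	uint32_t laddr, faddr;	/* local/foreign IPv4 address, 0 = wildcard */
	uint16_t lport, fport;	/* local/foreign port, 0 = unconnected */
	int jailed;		/* stand-in for the prison_flag() check */
	struct pcb *next;	/* next entry on the same hash chain */
};

/* Walk one hash chain and return the best match, or NULL. */
static struct pcb *
lookup(struct pcb *chain, uint32_t faddr, uint16_t fport,
    uint32_t laddr, uint16_t lport)
{
	struct pcb *p, *jail_wild = NULL, *host_wild = NULL;

	assert(lport != 0);	/* a query always names a concrete local port */

	/* Pass 1: full 4-tuple. First hit wins; jail state is never read. */
	for (p = chain; p != NULL; p = p->next)
		if (p->faddr == faddr && p->fport == fport &&
		    p->laddr == laddr && p->lport == lport)
			return (p);

	/* Pass 2: unconnected (listening) entries on the same local port. */
	for (p = chain; p != NULL; p = p->next) {
		if (p->faddr != 0 || p->fport != 0 || p->lport != lport)
			continue;
		if (p->laddr != laddr && p->laddr != 0)
			continue;
		/* Here the jail check matters: remember one of each kind. */
		if (p->jailed) {
			if (jail_wild == NULL)
				jail_wild = p;
		} else if (host_wild == NULL)
			host_wild = p;
	}
	/* Prefer the jailed listener, fall back to the host one. */
	return (jail_wild != NULL ? jail_wild : host_wild);
}
```

In this sketch the cache-miss concern from the quoted message only applies
to pass 2, since pass 1 never dereferences anything beyond the tuple
fields. Whether the real function can be restructured this way depends on
the SO_REUSEPORT question raised above.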
