Date: Mon, 6 Jan 2025 10:56:55 -0500
From: Mark Johnston
To: Paul Vixie
Cc: Santiago Martinez, Jamie Landeg-Jones, freebsd-net@freebsd.org
Subject: Re: per-FIB socket binding
References: <7772475.EvYhyI6sBW@dhcp-151.access.rits.tisf.net>
 <28EF197D-0D10-449A-A3C5-8B931F31CA6C@codenetworks.net>
 <38589000.XM6RcZxFsP@dhcp-151.access.rits.tisf.net>
In-Reply-To: <38589000.XM6RcZxFsP@dhcp-151.access.rits.tisf.net>
List-Id: Networking and TCP/IP with FreeBSD
List-Archive: https://lists.freebsd.org/archives/freebsd-net

On Fri, Dec 27, 2024 at 08:48:48AM +0000, Paul Vixie wrote:
> On Tuesday, December 24, 2024 3:34:45 AM UTC Santiago Martinez wrote:
> > Hi,
> > here’s another user of fibs. Each of our servers have multiple fibs and
> > jails with fibs. I like the proposed.
> > Santi
>
> Cool. Read on.
>
> On Tuesday, December 24, 2024 5:06:32 AM UTC Jamie Landeg-Jones wrote:
> > Paul Vixie wrote:
> > > ...
> > I like that. I isolate 5 separate networks by assigning a fib to each
> > interface, and was initially surprised that I had to jump through ipfw
> > hoops to get it to work properly, in fact at the end of my ipfw rules for
> > these interfaces, just to guarantee no leaking, ...
> >
> > So, yes, I agree that it's crocky, and your proposal is how I originally
> > expected it to work, and indeed, I can see no reason for it not to work that
> > way, but am prepared to be enlightened if anyone else has an opinion on
> > this.
> >
> > Jamie
>
> Groovy. See attached patch. This is just for TCP since I have no way to test SCTP and I
> think UDP will have to be handled at the application layer. There are two one line changes
> here.
>
> First, save the FIB number from the SYN in the syncache. This FIB number was in the
> incoming m_pkthdr so I didn't need to change any function signatures. Note that if the
> listener socket has a non-zero FIB number it will be used instead of the interface FIB
> number -- it's more specific and likely to be right.
>
> Second, when the initial ACK arrives and it's time for the connection to exit from the
> syncache and to become a socket, restore the original FIB number and apply it to the
> cloned socket, which will already have inherited its FIB number from the listener socket.
>
> This works here. The diff is for a 14.2 kernel but is likely backward-portable. I'd very much
> like to hear anybody's experience with this patch, or commentary on its approach and/or
> advisability.

I think the patch is probably a good idea, and the trick of only
inheriting the packet's FIB if the socket's FIB is zero would avoid
breaking compatibility for most cases.

One side effect of the patch is that a service listening in FIB 0 that
has no route to the source address of a connection attempt from a
different FIB would previously not accept such a connection, but now it
will. Perhaps that's drastic enough to warrant a sysctl and/or sockopt
to control this behaviour.

> diff --git a/sys/netinet/tcp_input.c b/sys/netinet/tcp_input.c
> index 83f85a50e..0e030f24f 100644
> --- a/sys/netinet/tcp_input.c
> +++ b/sys/netinet/tcp_input.c
> @@ -1057,7 +1057,7 @@ tcp_input_with_port(struct mbuf **mp, int *offp, int proto, uint16_t port)
>          }
>          inc.inc_fport = th->th_sport;
>          inc.inc_lport = th->th_dport;
> -        inc.inc_fibnum = so->so_fibnum;
> +        inc.inc_fibnum = so->so_fibnum || m->m_pkthdr.fibnum;
>
>          /*
>           * Check for an existing connection attempt in syncache if
> diff --git a/sys/netinet/tcp_syncache.c b/sys/netinet/tcp_syncache.c
> index 15244a61d..a50648fa5 100644
> --- a/sys/netinet/tcp_syncache.c
> +++ b/sys/netinet/tcp_syncache.c
> @@ -805,6 +805,7 @@ syncache_socket(struct syncache *sc, struct socket *lso, struct mbuf *m)
>           */
>          if ((so = solisten_clone(lso)) == NULL)
>                  goto allocfail;
> +        so->so_fibnum = sc->sc_inc.inc_fibnum;

It would be better to pass the fibnum to solisten_clone() and assign it
there. Otherwise, the value of so_fibnum will be wrong for a short
window during which the socket is passed to MAC and other hooks, which
might have some surprising effects.

>  #ifdef MAC
>          mac_socketpeer_set_from_mbuf(m, so);
>  #endif
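
A note on the tcp_input.c hunk quoted above: in C, the || operator
evaluates to 0 or 1, so the new expression would store 0 or 1 in
inc_fibnum rather than either FIB number. The description of the patch
("a non-zero listener FIB is used instead of the interface FIB")
suggests a conditional selection was intended. A minimal userland
sketch of the difference follows; select_fibnum(), logical_or_fibnum()
and fib_selection_examples() are hypothetical names for illustration
only and are not kernel functions, and the uint16_t width is an
assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the FreeBSD kernel): choose the FIB for a
 * new connection from the listener's FIB and the FIB of the incoming SYN.
 * A non-zero listener FIB is treated as more specific and wins.
 */
static uint16_t
select_fibnum(uint16_t so_fibnum, uint16_t pkt_fibnum)
{
	return (so_fibnum != 0 ? so_fibnum : pkt_fibnum);
}

/*
 * What an expression of the form "a || b" computes: logical OR only
 * ever yields 0 or 1, whatever the operand values are.
 */
static uint16_t
logical_or_fibnum(uint16_t so_fibnum, uint16_t pkt_fibnum)
{
	return (so_fibnum || pkt_fibnum);
}

/* Usage examples, written as assertions so they can be checked. */
void
fib_selection_examples(void)
{
	assert(select_fibnum(0, 3) == 3);	/* listener in FIB 0: packet FIB */
	assert(select_fibnum(2, 3) == 2);	/* listener FIB set: it wins */
	assert(logical_or_fibnum(0, 3) == 1);	/* FIB 3 collapses to 1 */
	assert(logical_or_fibnum(2, 3) == 1);	/* FIB 2 collapses to 1 */
}
```

If that reading of the intent is right, the same conditional form could
be used wherever the fibnum ends up being chosen, including a variant
of solisten_clone() that takes it as an argument.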