From: Rick Macklem
Date: Mon, 28 Nov 2022 17:04:24 -0800
Subject: Re: RFC: nfsd in a vnet jail
To: Alan Somers
Cc: FreeBSD CURRENT, "Bjoern A. Zeeb"
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="0000000000006e6e4805ee919271"

--0000000000006e6e4805ee919271
Content-Type: text/plain; charset="UTF-8"

On Fri, Nov 25, 2022 at 9:06 PM Alan Somers wrote:

> On Fri, Nov 25, 2022, 4:24 PM Rick Macklem wrote:
>
>> Hi,
>>
>> bz@ has encouraged me to fiddle with the nfsd
>> so that it works in a vnet jail.
>> I have now basically done so, specifically for
>> NFSv4, since NFSv3 presents various issues.
>>
>> What I have not yet done is put global variables
>> in the vnet. This needs to be done so that the nfsd
>> can be run in multiple jail instances and/or in and
>> outside of a jail.
>> The problem is that there are 100s of global variables.
>>
>> I can see two approaches:
>> 1 - Move them all into the vnet jail. This would imply
>>     that all the sysctls need to somehow be changed,
>>     which would seem to be a POLA violation.
>>     It also implies a lot of stuff in the vnet.
>> 2 - Just move the global variables that will always
>>     differ from one nfsd to another (this would make
>>     the sysctls global and apply to all nfsds).
>>     This will keep the number of globals in the vnet
>>     smaller.
>>
>> I am currently leaning towards #2, but what do others
>> think?
>>
>> rick
>> ps: Personally, I don't know what use there is of
>>     running the nfsd inside a vnet jail, but bz@ has
>>     some use case.
>>
>
> This is super-awesome! Thank you so much! I've got a use case too. I
> think it would be fine to leave most of the settings global, like
> max_threads. But we should probably decide on a case by case basis.
>

minthreads and maxthreads happen to be handled via nfsd command line
options, so sysctls are not needed for them and they can be set
per-prison.
Most of the sysctls are for weird cases or tuning of the DRC. Since the
DRC is only used for NFSv4.0 mounts and not NFSv4.1 or NFSv4.2 ones,
tuning the DRC should not usually be necessary.

I have left them global for now.

If anyone identifies one that needs to be set per-prison, I can move it
into the vnet.
If you want to see them all:
# sysctl -a | fgrep vfs.nfsd

I have put a first patch up on Phabricator as D37519. Although I listed
three people as reviewers, anyone is welcome to test/comment/review.
If you can't easily get the patch from Phabricator, just email me and
I'll send it to you. I think it will apply cleanly to main and, maybe,
stable/13.
You only need to build a kernel from patched sources to test it. There
is a change to rc.d/nfsd, which you only need in the prison's
etc/rc.d/nfsd.

A very basic setup document (also definitely a work in progress) can be
found at...
https://people.freebsd.org/~rmacklem/nfsd-vnet-prison-setup.txt

Let me know if you test it or have other suggestions,
rick
ps: Thanks everyone for your comments. If I have specific questions
    related to them, I'll post. Otherwise I am digesting them.
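
To illustrate the per-prison thread settings mentioned above, here is a
minimal sketch of what a prison's /etc/rc.conf might contain. It is not
taken from D37519 or from the setup document. It assumes the stock
rc.conf knobs (nfs_server_enable, nfsv4_server_enable, nfs_server_flags)
are honored by the patched rc.d/nfsd inside the prison, and the thread
counts are arbitrary values chosen for illustration.

```shell
# Hypothetical fragment for the prison's /etc/rc.conf (an assumption,
# not part of the patch).
nfs_server_enable="YES"      # start nfsd in this prison
nfsv4_server_enable="YES"    # the patch targets NFSv4
# -t serves TCP; --minthreads and --maxthreads are nfsd(8) command line
# options, so each prison can choose its own values (4 and 32 are
# arbitrary examples).
nfs_server_flags="-t --minthreads 4 --maxthreads 32"
```

Because the thread limits travel on the nfsd command line rather than
through a sysctl, a second prison could use different numbers without
touching the global vfs.nfsd sysctls.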
--0000000000006e6e4805ee919271--