Date: Thu, 30 Jun 2022 09:18:11 +0200
From: Gijs Peskens <gijs@peskens.net>
To: freebsd-jail@freebsd.org, Doug Rabson <dfr@rabson.org>, freebsd-net@freebsd.org
Cc: Samuel Karp <me@samuelkarp.com>
Subject: Re: Container Networking for jails
Message-ID: <FBE79414-8FC7-4357-A6DF-A2B4755DDF62@peskens.net>
In-Reply-To: <CACA0VUgNHPaDUkzgZh48DXtB9KxkLLcWGX=HgH8-yrVYkAEQwQ@mail.gmail.com>
References: <CACA0VUgNHPaDUkzgZh48DXtB9KxkLLcWGX=HgH8-yrVYkAEQwQ@mail.gmail.com>

I went with exactly the same design for the Docker port I started a
while ago.

The reason I went with that design is that there weren't any facilities
to modify a jail's vnet network configuration from outside of the jail,
so it's necessary to enter the jail and run ifconfig et al. Linux jails
will lack a compatible ifconfig, so having a parent FreeBSD-based vnet
jail ensures that networking can be configured for Linux children.

There is a risk to using the / filesystem: users who are allowed to set
up and configure containers run standard system tools as root on the
root filesystem, even if they do not have root permission themselves.
If an exploit were ever found in any of those tools that allowed files
to be modified, it could be used as a step in a privilege escalation.

IMHO, that risk is acceptable in a first port, but it should be
documented. Ideally, an option should be provided to use an alternative
root if the user deems the risk unacceptable.

On 30 June 2022 09:04:24 CEST, Doug Rabson <dfr@rabson.org> wrote:
> I wanted to get a quick sanity check for my current approach to
> container networking with buildah and podman. These systems use CNI
> (https://www.cni.dev) to set up the network. This uses a sequence of
> 'plugins' which are executables that perform successive steps in the
> process - a very common setup uses a 'bridge' plugin to add one half
> of an epair to a bridge and put the other half into the container's
> vnet. IP addresses are managed by an 'ipam' plugin and an optional
> 'portmap' plugin can be used to advertise container service ports on
> the host. All of these plugins run on the host with root privileges.
>
> In Kubernetes and podman, it is possible for more than one container
> to share a network namespace in a 'pod'. Each container in the pod can
> communicate with its peers directly via localhost and they all share a
> single IP address.
>
> Mapping this over to jails, I am using one vnet jail to manage the
> network namespace and child jails of this to isolate the containers.
> The vnet jail uses '/' as its root path and the only things which run
> inside this jail are the CNI plugins. Using the host root means that a
> plugin can safely call host utilities such as ifconfig and route
> without having to trust the container's version of them. An important
> factor here is that the CNI plugins will only be run strictly before
> the container (to set up) or strictly after (to tear down) - at no
> point will CNI plugins be executed at the same time as container
> executables.
>
> The child jails use ip4/6=inherit to share the vnet and each will use
> a root path to the container's contents in the same way as a normal
> non-hierarchical jail.
>
> Can anyone see any potential security problems here, particularly
> around the use of nested jails? I believe that the only difference
> between this setup and a regular non-nested jail is that the vnet
> outlives the container briefly before it is torn down.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
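
For readers following the thread, the jail layout discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used by either port: the pod name, container names and root paths are hypothetical, the commands are built and printed rather than executed, and it assumes that a child jail can be created from the host by giving it a dotted name under its parent. The jail(8) parameters used (name, path, vnet, children.max, persist, ip4=inherit, ip6=inherit) and the -c/-r flags are the standard ones; `persist` on the children stands in for whatever command the container would actually run.

```python
from typing import Dict, List


def parent_jail_cmd(pod: str, max_children: int) -> List[str]:
    """jail(8) arguments for the vnet jail that owns the pod's network stack.

    The parent uses the host root ('/') so that network setup tools run
    from trusted host binaries, as described in the thread.
    """
    return [
        "jail", "-c",
        f"name={pod}",
        "path=/",
        "vnet",
        f"children.max={max_children}",
        "persist",
    ]


def child_jail_cmd(pod: str, container: str, rootfs: str) -> List[str]:
    """jail(8) arguments for one container jail nested under the pod jail.

    The child inherits the parent's IPv4/IPv6 configuration and uses the
    container's own root path. The dotted name is an assumption about how
    the hierarchy is expressed from the host.
    """
    return [
        "jail", "-c",
        f"name={pod}.{container}",
        f"path={rootfs}",
        "ip4=inherit",
        "ip6=inherit",
        "persist",
    ]


def lifecycle(pod: str, containers: Dict[str, str]) -> List[List[str]]:
    """Return the ordered jail commands for bringing a pod up and down."""
    steps = [parent_jail_cmd(pod, len(containers))]
    # Network setup (the CNI plugins) would run here, inside the parent
    # jail, strictly before any container jail exists.
    steps += [child_jail_cmd(pod, name, root) for name, root in containers.items()]
    # Teardown: remove the container jails first ...
    steps += [["jail", "-r", f"{pod}.{name}"] for name in containers]
    # ... then network teardown would run, and finally the vnet jail goes.
    steps.append(["jail", "-r", pod])
    return steps


if __name__ == "__main__":
    # Hypothetical pod with two containers; nothing is executed.
    for step in lifecycle("pod1", {"web": "/containers/web", "db": "/containers/db"}):
        print(" ".join(step))
```

The ordering is the point of the sketch: the vnet jail exists before and after the container jails, which matches the observation in the thread that the vnet briefly outlives the container.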
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?FBE79414-8FC7-4357-A6DF-A2B4755DDF62>