Date:      Thu, 30 Jun 2022 09:18:11 +0200
From:      Gijs Peskens <gijs@peskens.net>
To:        freebsd-jail@freebsd.org,Doug Rabson <dfr@rabson.org>,freebsd-net@freebsd.org
Cc:        Samuel Karp <me@samuelkarp.com>
Subject:   Re: Container Networking for jails
Message-ID:  <FBE79414-8FC7-4357-A6DF-A2B4755DDF62@peskens.net>
In-Reply-To: <CACA0VUgNHPaDUkzgZh48DXtB9KxkLLcWGX=HgH8-yrVYkAEQwQ@mail.gmail.com>
References:  <CACA0VUgNHPaDUkzgZh48DXtB9KxkLLcWGX=HgH8-yrVYkAEQwQ@mail.gmail.com>

I went with exactly the same design for the Docker port I started a
while ago.
The reason I went with that design is that there weren't any facilities
to modify a jail's vnet network configuration from outside the jail, so
it is necessary to enter the jail and run ifconfig et al.
Linux jails will lack a compatible ifconfig.
So having a parent FreeBSD-based vnet jail ensures that networking can
be configured for Linux children.
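
As a rough illustration of this layout, the Python sketch below only
builds the command lines for one pod; it does not run anything. The jail
name, bridge, epair name, address and container root path are all
hypothetical, and the epair is assumed to exist already (for example
from "ifconfig epair create"). Creating the child from the host with a
dotted name relies on the hierarchical jail names described in jail(8)
and should be treated as an assumption, not as what either port
actually does.

```python
import shlex


def pod_commands(pod, containers, net_root="/", bridge="bridge0",
                 epair="epair0", addr="10.88.0.2/16"):
    """Build (but do not run) the commands for one vnet 'pod' jail.

    containers maps a hypothetical container name to its root path.
    """
    cmds = [
        # Parent jail: owns the vnet. By default it uses the host root so
        # that the FreeBSD ifconfig is available inside it.
        ["jail", "-c", f"name={pod}", f"path={net_root}", "vnet", "persist",
         f"children.max={len(containers)}"],
        # Host side: add one half of the (pre-existing) epair to the bridge.
        ["ifconfig", bridge, "addm", f"{epair}a"],
        ["ifconfig", f"{epair}a", "up"],
        # Move the other half into the parent jail's vnet.
        ["ifconfig", f"{epair}b", "vnet", pod],
        # Configure the interface by entering the parent jail.
        ["jexec", pod, "ifconfig", f"{epair}b", "inet", addr, "up"],
    ]
    # Child jails: one per container, sharing the parent's network stack.
    for name, root in containers.items():
        cmds.append(["jail", "-c", f"name={pod}.{name}", f"path={root}",
                     "ip4=inherit", "ip6=inherit", "persist"])
    return cmds


if __name__ == "__main__":
    for argv in pod_commands("pod1", {"web": "/containers/web"}):
        print(shlex.join(argv))
```

Passing a different net_root would be one way to point the parent jail
at an alternative root instead of "/".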

There is a risk to using the / filesystem: users who are allowed to set
up and configure containers run standard system tools as root on the
root filesystem, even if they do not have root permission themselves.
If an exploit were ever found in any of those tools that allowed files
to be modified, it could be used as a step in a privilege escalation.

IMHO, that risk is acceptable in a first port, but it should be
documented. Ideally, an option should be provided to use an alternative
root if the user deems the risk unacceptable.




On 30 June 2022 09:04:24 CEST, Doug Rabson <dfr@rabson.org> wrote:
>I wanted to get a quick sanity check for my current approach to
>container networking with buildah and podman. These systems use CNI
>(https://www.cni.dev) to set up the network. This uses a sequence of
>'plugins' which are executables that perform successive steps in the
>process - a very common setup uses a 'bridge' plugin to add one half of
>an epair to a bridge and put the other half into the container's vnet.
>IP addresses are managed by an 'ipam' plugin and an optional 'portmap'
>plugin can be used to advertise container service ports on the host.
>All of these plugins run on the host with root privileges.
>
>In kubernetes and podman, it is possible for more than one container to
>share a network namespace in a 'pod'. Each container in the pod can
>communicate with its peers directly via localhost and they all share a
>single IP address.
>
>Mapping this over to jails, I am using one vnet jail to manage the
>network namespace and child jails of this to isolate the containers.
>The vnet jail uses '/' as its root path and the only things which run
>inside this jail are the CNI plugins. Using the host root means that a
>plugin can safely call host utilities such as ifconfig and route
>without having to trust the container's version of them. An important
>factor here is that the CNI plugins will only be run strictly before
>the container (to set up) or strictly after (to tear down) - at no
>point will CNI plugins be executed at the same time as container
>executables.
>
>The child jails use ip4/6=inherit to share the vnet and each will use a
>root path to the container's contents in the same way as a normal
>non-hierarchical jail.
>
>Can anyone see any potential security problems here, particularly
>around the use of nested jails? I believe that the only difference
>between this setup and a regular non-nested jail is that the vnet
>outlives the container briefly before it is torn down.

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


