Date: Sun, 7 Jun 2015 10:37:34 +0200
From: Mateusz Guzik
To: kikuchan
Cc: freebsd-jail@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: [patch] separate SysV IPC namespace for jail
Message-ID: <20150607083734.GB9182@dft-labs.eu>
References: <20150605235348.GA9965@dft-labs.eu> <20150607013929.GA9182@dft-labs.eu>
List-Id: "Discussion about FreeBSD jail\(8\)"

On Sun, Jun 07, 2015 at 04:43:16PM +0900, kikuchan wrote:
> Hi Mateusz,
>
> Thanks for your reply!
>
> First of all, I intend to *jail* SysV IPC user completely.
> (unless user really want to interact with each other between jails)
>
> I think SysV IPC is simple but obsolete, so you can design whatever
> you want for jail system.
> Also, I want keep everything simple.
>
> My design (to be sure):
> - Each entry of the list (shown in ipcs) belongs to a jail.
> - Any operation to SHM/SEM/MSG attempted from another jail, will just
>   fail with EACCES.
>

But why? See below.

> > "Address space can be shared between multiple PROCESSES, what happens if
> > such a pair ends up in different jails? Preferably such a scenario would
> > be prohibited to avoid future accidents."
> >
> > However, sysvipc namespace sharing is an ok feature esp. with
> > multi-level jails. In the simplest scenario upon jail creation you
> > decide whether it gets its own namespace or inherits it.
> >
> > > > What about existing sysvshm mappings when jailing?
> > >
> > > Real (not jailed) environment is treated as a jail with jid=0 in kernel.
> > > If you create sysvshm memory segment before entering a jail, the
> > > segment simply owned by jid=0.
> > >
> >
> > The point is you get a process with sysvshm segments from 2 different
> > jails. Looks like solid trouble protential.
>
> Ok, I think I've got what you'd concerned.
>
> In my design, setting up such processes would be difficult.
> This wouldn't be happend normally, because shared memory segments
> should be obtained BEFORE entering a jail;
>
> 1. Create a segment on jid=0 with shmget()
> 2. shmat() to attach (get void *ptr)
> 3. fork()
> 4. A child process entering to jid=1 with jail_attach()
> 5. The child process and the parent process can share the address
>    space (via *ptr).
> 6. If the child process do shmat() on the same ID again, it simply
>    failed with EACCES.
>
> It means, there is NO way to obtain a segment created in other jail
> AFTER jailed (even if you're root or obtaining the segment created on
> jid=0).

This is sharing a page, not an address space (see below). This poses
serious problems if actual separate namespaces are implemented,
otherwise it only leaves a potential for bugs for no real gain.

> > As a minimum this is singlethreading
> > when jailing, prevention of jailing processes with shared virtual address
> > spaces and ones with existing sysvshm mappings. All this is to reduce
> > amount of bugs one would have to deal with.
>
> Virtual memory allocation and related stuff are protected and done by
> kernel already, because it's an IPC (Inter Process Communication).
> Moreover, you cannot change an owner of the IPC entry after creation,
> so we don't need an additional protection in kernel.
>

Here is an example race: on fork memory mappings are copied first,
sysvshm data is updated /later/. What happens if one of the calling
threads enters a jail while some other thread is forking?
This may be buggy as it is already, but that's roughly the scheme.

It looks like we have some weird miscommunication here, so let me
restate.

I do see great benefit in having jail-aware ipcs. I do not believe the
way to achieve it is to add jail-aware permission checks. The support in
question should provide separate namespaces. There are several upsides,
including lack of conflict between jails and plugged infoleaks.

In general I don't understand why you insist on your approach; it does
not have any advantage over separate namespaces that I could see.

-- 
Mateusz Guzik
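
For readers following the thread, the six-step scenario quoted above, and
the reply that it shares a page rather than an address space, can be
sketched in portable C. This is only a minimal illustration and not code
from the patch under discussion: shared_page_demo() and struct demo_result
are hypothetical names made up for the sketch, and step 4 (jail_attach) is
left as a comment because it is FreeBSD-specific and needs root. Whether an
existing mapping stays usable after jail_attach() under the proposed patch
is the claim made in the quoted step 5 and is not verified here.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result record for this sketch; not part of any patch. */
struct demo_result {
	int shared_value;	/* what the parent reads from the SysV segment */
	int private_value;	/* what the parent reads from ordinary memory */
};

/*
 * Walks through steps 1-3 and 5 of the quoted list.
 * Returns 0 on success, -1 if any system call fails.
 */
int
shared_page_demo(struct demo_result *out)
{
	int private_value = 1;	/* ordinary memory: the child gets a copy */
	int status;
	pid_t pid;

	assert(out != NULL);

	/* 1. Create a one-int segment that only this user can access. */
	int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
	if (shmid == -1)
		return (-1);

	/* 2. Attach it; the mapping is inherited across fork(). */
	int *shared = shmat(shmid, NULL, 0);
	if (shared == (void *)-1) {
		shmctl(shmid, IPC_RMID, NULL);
		return (-1);
	}
	*shared = 1;

	/* 3. fork(): the segment stays shared, everything else is copied. */
	pid = fork();
	if (pid == -1) {
		shmdt(shared);
		shmctl(shmid, IPC_RMID, NULL);
		return (-1);
	}
	if (pid == 0) {
		/* 4. The quoted scenario would call jail_attach(2) here. */
		*shared = 42;		/* lands in the shared page */
		private_value = 42;	/* lands in the child's private copy */
		_exit(0);
	}

	if (waitpid(pid, &status, 0) == -1 ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
		shmdt(shared);
		shmctl(shmid, IPC_RMID, NULL);
		return (-1);
	}

	/* 5. Only the segment reflects the child's write. */
	out->shared_value = *shared;
	out->private_value = private_value;

	shmdt(shared);
	shmctl(shmid, IPC_RMID, NULL);
	return (0);
}
```

Called from a small main(), the function should report the child's write
in shared_value while private_value keeps the parent's original value,
which is the sense in which fork() plus shmat() shares a page and not an
address space. The race described in the message (a thread entering a jail
while another thread forks) is not reproduced by this sketch.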