From owner-freebsd-arch@FreeBSD.ORG Wed May 8 16:26:46 2013 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 35A1BAC4 for ; Wed, 8 May 2013 16:26:46 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id 0EAE9D9E for ; Wed, 8 May 2013 16:26:46 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 6A57BB917; Wed, 8 May 2013 12:26:45 -0400 (EDT) From: John Baldwin To: Konstantin Belousov Subject: Re: Extending MADV_PROTECT Date: Wed, 8 May 2013 12:09:49 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <201305071433.27993.jhb@freebsd.org> <20130508095827.GK3047@kib.kiev.ua> In-Reply-To: <20130508095827.GK3047@kib.kiev.ua> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201305081209.49429.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 08 May 2013 12:26:45 -0400 (EDT) Cc: arch@freebsd.org X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 May 2013 16:26:46 -0000 On Wednesday, May 08, 2013 5:58:27 am Konstantin Belousov wrote: > On Tue, May 07, 2013 at 02:33:27PM -0400, John Baldwin wrote: > > One of the issues I have with our current MADV_PROTECT is that it > > isn't very administrative-friendly. That is, as a sysadmin I can't > > easily protect arbitrary processes from the OOM killer. Instead, the > > binary has to be changed to invoke madvise(). Furthermore, once the > > protection is granted it can't be revoked. Also, any binaries that > > want this have to be run as root. Instead, I would like to be able > > to both set and revoke this for existing processes and possibly even > > allow it to be inherited (so I can tag a top-level daemon that forks > > and have all its future children be protected for example). To that > > end I've whipped up a simple patch (against 8, but should port to > > HEAD easily if folks think it is a good idea) to add a new pprotect() > > system call and userland program (protect) that can be used similar to > > ktrace(1) either as a modifier when running a new program or as a tool > > for setting or clearing protection for existing processes. > > > > The inherit feature isn't implemented yet, but it should be simple > > to do. One would simply need a new flag that PPROT_INHERIT sets that > > is checked on fork and propagates P_PROTECTED if it is set. Also, > > one other thought I had is that at some point we might want to make > > P_PROTECTED more fine-grained, e.g. by allowing for OOM "priorities". > > To that end, it may make sense to add a new argument to protect, > > though you could also reserve part of the 'op' parameter to encode a > > priority. > > Wouldn't the pprot_setchildren() miss a child for which the new pid and > struct proc are already allocated in the do_fork(), but which is not yet > linked into the process tree ? If true, I think this does not > fulfill the promise of the PPROT_DESCEND. ktrace has the same issue, and really, this is just a race. If the user had run the command a few nanoseconds earlier the proc wouldn't be allocated at all, and I doubt a user would notice the difference in those two cases. If you are doing this programmatically then that is a race that the program can handle. It isn't any different from having a new process begin its fork() a few nanoseconds after this returns either. This is why if you want that behavior you would use -di (and applies equally to ktrace). > It seems that the patch posted missed the chunk for sys/proc.h. > For HEAD, you probably need e.g. p_flag2 and P2_PROTECTED instead. P_PROTECTED already exists on both HEAD and 8.x (the MADV_PROTECT it is extending is quite old). What we would need a p_flag2 for would be to add a new P_PROTECTED_INHERIT if we decided we wanted to support that. > Since the syscall is mean to be extended in the future, would it make > more sense to add a multiplexer, e.g. procctl(2), one operation of which > would be PROCCTL_PROTECT ? Do we expect it to do more than adjust protection? We already have a few other process-control system calls (e.g. ptrace()). It's hard to ensure it is sufficiently generic when only abstracting from one use case. -- John Baldwin