From: Robert Watson
Date: Thu, 28 Aug 2003 11:34:09 -0400 (EDT)
To: Joe Greco
cc: freebsd-current@freebsd.org
Subject: Re: Someone help me understand this...?

On Thu, 28 Aug 2003, Joe Greco wrote:

> > On Wed, 27 Aug 2003, Joe Greco wrote:
> > > The specific OS below is 5.1-RELEASE but apparently this happens on 4.8
> > > as well.
> >
> > Could you confirm this happens with 4.8?  The access control checks there
> > are substantially different, and I wouldn't expect the behavior you're
> > seeing on 4.8...
>
> Rather difficult.  I'll see if the client will let me trash a production
> system, but usually people don't like $40K servers handing out a few
> hundred megabits of traffic going out of service.
> We were trying to fix
> it on the scratch box (which happens to have 5.1R on it) and then were
> going to see how it fared on the production systems.

I think it's safe to assume that if you're seeing a similar failure on 4.8,
it has a different source, given my reading of the code, but I'm willing to
be proven wrong.  It's probably not worth the investment if you're talking
about large quantities of money, though.

> > Clearly, unbreaking applications like Diablo by default is desirable.  At
> > least OpenBSD has similar protections to these turned on by default, and
> > possibly other systems as well.  As 5.x sees more broad use, we may well
> > bump into other cases where applications have similar behavior: they rely
> > on no special protections once they've given up privilege.  I wonder if
> > Diablo can run unmodified on OpenBSD; it could be they don't include
> > SIGALRM on the list of "protect against" signals, or it could be that they
> > modify Diablo for their environment to use an alternative signaling
> > mechanism.  Another alternative to this patch would simply be to add
> > SIGALRM to the list of acceptable signals to deliver in the
> > privilege-change case.
>
> I wonder if it would be reasonable to have some sort of interface that
> allowed a program to tell FreeBSD not to set this flag...  if not, at
> least if there was a sysctl, code could be added so that the daemon
> checked the flag when starting and errored out if it wasn't set.

We actually have such an interface, but it's only enabled for the purposes
of regression testing.  If you compile "options REGRESSION" into the kernel
configuration, a new system call, __setsugid(), is exposed to applications.
It's used by src/tools/regression/security/proc_to_proc to make it easier to
set up process pairs for regression testing of inter-process access control.
When I added it, there was some interest in just making it setsugid() and
exposing it to all processes.  Maybe we should just go this route for
5.2-RELEASE.
Invoking it with a (0) argument would mean the application writer accepted
the inherent risks.  However, this would open the application to the risks
of debugging attachment, which are probably greater than the signal risks in
most cases.  It's not clear what the best way to express "I want to accept
but not " would be...

So far, it sounds like we have three workarounds in the pot; perhaps we can
think of something better:

(1) Remove SIGALRM from the list of prohibited signals in the P_SUGID case.
    Not clear what the risks are here based on common application use, but
    this is an easy change to make.

(2) Add setsugid() to allow applications to give up implicit protections
    associated with credential changes.  This comes with greater risks, I
    suspect, since it opens up applications to more explicit
    vulnerabilities: signal attacks require more sophistication and luck,
    but debugging attacks are "easy".

(3) Allow administrators to selectively disable the more restrictive signal
    checks at a system scope using a sysctl.  This is easy, and comes with
    no risks as long as the setting is unchanged (the default in the patch
    I sent out earlier).

I'm tempted to commit (1) immediately to allow a workaround if we get
nothing else figured out, and to think some more about (2) and (3).

Another possibility would be to encourage application writers to avoid
overloading signals that already have "meanings", and to rely on the USR
signals instead.  I assume the reason Diablo uses ALRM is that the USR
signals already have assigned semantics?

> > BTW, it's worth noting that the mechanism Diablo is using to give up
> > privilege actually does retain some "privileges" -- it doesn't, for
> > example, synchronize its resource limits with those of the user it is
> > switching to, so it retains the starting resource limits (likely those of
> > the root account).
>
> That's actually preferred in most cases.
> News servers almost always eat
> far more resources than whatever limits you might set by default, which
> just turns into telling people to remove the limits or use root's
> limits.  Generally if a news package bumps limits bad things happen.

Right now, most applications in the base system make use of the
setusercontext() call to modify their protections as part of a switch of
users.  They often pass in the flag LOGIN_SETALL and then remove the bits
they don't need, such as LOGIN_SETRESOURCES.  This also has the side effect
of setting up things like the umask based on the user default in login.conf,
setting the default paths, etc.  This may be overkill for what you're
looking for, though, and there's a lot of value to "if it ain't broke, don't
fix it".

> > A preferred structuring of privilege separation
> > attempts to avoid this scenario by containing privilege in a process that
> > is as independent as possible from the unprivileged processes, and uses
> > file descriptor passing to get a bound port to the unprivileged processes,
> > rather than credential manipulation which is fairly failure-prone.
>
> Yes, and such a thing is actually available, though it introduces some
> new issues, because the daemons can be configured to allow various bound
> ports (needing a variable number of fd's, etc) and this also breaks
> legacy sites where people have custom startup scripts.  Ugh.  We did
> that originally so people could get core dumps on FreeBSD.

Yeah.  The point on application behavior is probably to affect future
application development and changes -- we still need to address current
configurations.

> Yeah, yeah, it's Matt Dillon legacy code.  Matt tended to ignore error
> returns from things where an error was not expected and even if one was
> reported, nothing (beyond a message) could be done.  It actually took me
> a while to isolate the kill issue as a result, because...  the rval from
> kill was being ignored (now the error gets syslog'ed).
In most cases, fail-stop is a reasonable response to unexpected security
behavior from the system, but ignoring errors is likely to shoot you later.
:-)  I tend to wrap even kill() calls as uid 0 in an assertion check, just
to be on the safe side.  If nothing else, it helps detect the case where the
other process has died, and you're using a stale pid.  It's particularly
useful if the other process has died, the pid has been reused, and it's now
owned by another user, which is a real-world case where kill() as a non-0
uid can fail even when you're sure it can't :-).

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories
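
The "check every kill()" habit described above can be sketched in C roughly
as follows.  This is only an illustration under stated assumptions, not
Diablo's code or anything in the FreeBSD tree: the helper names
checked_kill() and kill_or_abort() are made up for the example, and the
messages attached to ESRCH and EPERM are guesses at the likely cause, not a
diagnosis.

```c
#define _POSIX_C_SOURCE 200809L	/* for kill() under strict -std modes */

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>		/* getpid(), handy for a self-check */

/*
 * Hypothetical helper: send a signal and report any failure instead of
 * silently dropping it.  Returns 0 on success, or the errno value that
 * kill() set on failure.
 */
int
checked_kill(pid_t pid, int sig)
{
	if (kill(pid, sig) == 0)
		return 0;

	int err = errno;	/* save it before fprintf() can change it */

	switch (err) {
	case ESRCH:
		/* Target is gone: we may be holding a stale pid. */
		fprintf(stderr, "kill(%ld, %d): no such process\n",
		    (long)pid, sig);
		break;
	case EPERM:
		/* Pid reused by another user, or a credential check refused. */
		fprintf(stderr, "kill(%ld, %d): operation not permitted\n",
		    (long)pid, sig);
		break;
	default:
		fprintf(stderr, "kill(%ld, %d): %s\n",
		    (long)pid, sig, strerror(err));
		break;
	}
	return err;
}

/*
 * Hypothetical fail-stop wrapper: abort if the signal could not be
 * delivered.  Note that assert() compiles away when NDEBUG is defined.
 */
void
kill_or_abort(pid_t pid, int sig)
{
	int err = checked_kill(pid, sig);

	assert(err == 0);
	(void)err;		/* quiet unused-variable warning under NDEBUG */
}
```

A daemon that prefers to log and carry on would call checked_kill() and
inspect the return value; one that prefers fail-stop would call
kill_or_abort().  Passing signal 0, as in checked_kill(getpid(), 0), only
performs the existence and permission checks without delivering anything.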