Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 5 Apr 2001 17:06:17 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        bde@zeta.org.au (Bruce Evans)
Cc:        tlambert@primenet.com (Terry Lambert), rwatson@FreeBSD.ORG (Robert Watson), freebsd-arch@FreeBSD.ORG, dillon@FreeBSD.ORG
Subject:   Re: Eliminate crget() from nfs kernel code?
Message-ID:  <200104051706.KAA24246@usr01.primenet.com>
In-Reply-To: <Pine.BSF.4.21.0104051805550.46796-100000@besplex.bde.org> from "Bruce Evans" at Apr 05, 2001 06:18:44 PM

next in thread | previous in thread | raw e-mail | index | archive | help
> > [ ... crget() ... ]
> > 
> > I am not too happy with crget() at the moment.  Even discounting
> > the fact that it calls MALLOC(), and does not check the results
> > (the new [BAD] semantics permit this to fail under extremely low
> > memory conditions [FOR NO GOOD REASON] instead of hanging), it is
> 
> New [BAD] semantics for malloc(..., M_WAITOK) would require some
> dead bodies :-).  I haven't seen any.

Check every occurance of MALLOC() and the two lines immediately
following it in the kernel.

Some calls with M_WAITOK check for a NULL value for the assigned
pointer; others do not.

Someone is using it wrong.

I've personally seen a panic, preceeded by several warning messages
(which only occur with "INVARIANTS" turned on), under heavy network
load.

Specifically, write a small program that accepts and closes socket
connections, while on the other end, have a client which connects
as fast as possible.

This is (effectively) a simulation of the highest possible HTTP
connection traffic.  The messages complaining that the memory on
the free list has changed out from under it will occur when the
system load as a result of the server hits around 98.72%.  The
panic can occur anywhere.  However, it's easy to accelerate the
panic by running a program that sleeps 1 second, runs a grep for
something, and sleeps again.  A lot of the complaints are about
"was type cred".

If you end up exacerbating the panic with the grep shell script,
the pace it _invariably_ occurs is in grep calling access(2)
calling crdup(), where, after the allocation, the uihold() fails
on the copy of the credential with a "page not present" error
on the attempted increment through indirect (incx instruction).

When I check the return of the MALLOC(), I don't see a problem.

Obviously, someone is either freeing something twice, using
memory they are not allowed to use, or freeing something and then
continuing to use it.

Though there is a race condition that you could drive a truck
through in close(2) (the reference count is decremented to zero,
menaing that the descriptor could end up reused, prior to a lot
of things being done to the now zero referencs descriptor), a
gross closing of that hole indicates that it is not a close race
that is the problem, and that the problem has to be occuring in
the soclose() code, or in the credentials code.

Since this only happens under load, I am assuming an overflow,
where an allocation from the free list is not being checked for
a wrap.

So once again: the finger points to MALLOC()..



> > If you "fix" crget(), you will also need to fix crdup().  There
> > are plenty of places where crdup() is called, not just in the
> > access() system call, where it is bogusly used to replace _only_
> > the initial group of the real GID, leaving the groups of the
> > effective UID active, falsely yielding access to the file, even
> > if the real UID would have not have contained the same group list
> > as the effecive UID (gotta love "security" code).
> 
> This just how access() works.  It checks the access that you would
> have setting the IDs to the real ones.  Setting the IDs to the real
> ones has no effect on the groups list except possibly for removing/
> changing the effective GID if that is on the list.

The problem with this approach is that a priviledged process which
calls setgroups(2) using its effective user ID to permit it to make
the call will change the group set for the "real" credential, when
the access(2) call occurs.

The only way I can see around this is to keep separate cred structures
for real and effective IDs, and thus keep seperate groups, and pass
the "real" cred structure, instead of a hacked "effective" structure
down to do the work.

The problem with doing this is that there is not a [sg]etegroups(2) call.

Since setgroups(2) is a BSD-ism, it seems to me that the only real
way to fix this is to introduce two new BSD-isms for manipulating
effective group lists, which would operate on the effective cred's
grouplist, instead of on the gloabl group list..

I can't say I'm thrilled with the idea.  My personal implementation
choice would be to make [sg]etgroups(2) take an additional parameter,
and have user space make the call via [sg]et{e}groups(3), with the
second parameter selecting between effective and real.  This could
be made binary backward compatible using knowledge of the number of
real arguments on the stack to decide on the behaviour, if necessary.


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?200104051706.KAA24246>