Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 15 Jul 1999 11:48:46 -0400 (EDT)
From:      Robert Watson <robert@cyrus.watson.org>
To:        Nate Williams <nate@mt.sri.com>
Cc:        freebsd-security@FreeBSD.ORG
Subject:   Re: how to keep track of root users?
Message-ID:  <Pine.BSF.3.96.990715112817.27279D-100000@fledge.watson.org>
In-Reply-To: <199907151504.JAA02317@mt.sri.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, 15 Jul 1999, Nate Williams wrote:

> [ I'm supposed to be on vacation ;( ]

Then don't read your email, or at least ignore it.  :-)
 
> > > Ahh, I understand now.  You are worried about one or the other of
> > > namei/audit copyin being redundant.  I misunderstood both you and
> > > Garrett.  Would it be possible to copy the string from the namei buffer,
> > > thus avoiding the issue of modifying namei?
> > 
> > I haven't looked closely at the namei implementation -- at some point the
> > entire string is dumped into a KTRACE record by namei()
> 
> Assuming we use KTRACE, which I though we decided wasn't up to the task
> due to many things.  Given that it's not up to it, let's not rely on it
> doing anything, or being reliant on it.

My point we merely that wherever KTRACE is doing it seems to be a good
place to do it--we definitely want only one copyin, and it really should
be the same copyin as namei() is using, to prevent races where shared
memory is in use on an SMP.  Otherwise two processes can deceive the
auditing mechanism: p1 and p2 share a memory segment.  p1 invokes chmod()
with a string argument in the shared secment.  The kernel, on behalf of
p1, copies in the auditing record from p1.  Then p2 modifies the value,
followed by the kernel reading the value in again for the purposes of
namei().  There are other similar nasties: you should use the same copied
in value in situations like read(), as otherwise the kernel can be
pursuaded to conveniently obliterate the data as a result of the syscall.
I don't really have anything to prove here, except that copying in
arguments is something that has to be done properly to minimize the
chances of races.

> > On
> > return from the syscall, it would tag the audit record with the return
> > code, error condition a timestamp, and commit the audit record.
> 
> But, we'd still need to instrument execve to get information such as the
> stat information from the inode (ownership, permissions, mount
> information, etc...).  This information only exists inside of execve()
> (unless we decided to do another stat on the file in the code, which
> seems absolutely silly since the information already exists.)
> 
> > It would then be the responsibility of the syscall code itself to
> > submit a list of arguments as appropriate, as well as any more
> > detailed subject/object information, etc.
> 
> Again, since we're already instrumenting the syscall code, let's do it
> all there, and provide some 'generic' stubs that each syscall can do on
> it's own, rather than do the re-direction in the kernel.
> 
> Again, I think we're in agreement on *what* needs to be done, but not on
> the specifics of how it needs to be done.
> 
> Your way adds a level of indirection that isn't obvious.  My way makes
> it obvious what is being done *PLUS* allows us to create a record in one
> fell swoop, instead of in pieces.
> 
> Ex:
> 
> execve()
> {
> ....
> #if _POSIX_AUD
>    audit_rec ac;
> 
>    ac = new ac(p);
>    ac.syscallNum = KERN_EXECVE;
> 
> #endif
> ....
> #if _POSIX_AUD
>    /* Totall bogus code */
>    ac.perms = vnode.perms;
>    ac.owner = vnode.uid;
>    ac.group = vnode.gid;
> #endif
> 
> Hopefully you get the point.  This entire record is created from start
> to finish inside the syscall, thus making it very obvious what
> information is gathered at the code level, rather than trying to run
> around the different parts of the kernel to figure out what is going on.

Having implemented a number of syscalls like this already, I found the
experience to be quite painful: every possible exit condition from the
syscall needs to have an individual commit with appropriate data, etc.
This means instead of fairly nice

i = copyin(...arg);
if (i)
	return(i);
ac_add_arg(ac, arg);

You end up with

i = copyin(...arg...);
if (i) {
	ac.errorcode = i;
	ac_commit(ac);
	return(i);
}
ac_add_arg(ac, arg);

If you look at most syscalls, there are lots of possible return points
based on a plethora of failures: chmod will copy in a string, namei to a
vnode, attempt to apply the operationg to the vnode.  

By the way, I think it'd possible for a syscall to return EFAULT without
ever getting into the syscall code itself.  It looks like syscall() in
trap.c performs the copyin on arguments specified by the function
prototype in syscalls.master:

syscall(frame)
        struct trapframe frame;
{
        caddr_t params;
        int i;
        struct sysent *callp;
        struct proc *p = curproc;
        u_quad_t sticks;
        int error;
        int args[8];
        u_int code;
...
        if (params && (i = callp->sy_narg * sizeof(int)) &&
            (error = copyin(params, (caddr_t)args, (u_int)i))) {
#ifdef KTRACE
                if (KTRPOINT(p, KTR_SYSCALL))
                        ktrsyscall(p->p_tracep, code, callp->sy_narg,
args);
#endif
                goto bad;
        }
#ifdef KTRACE
        if (KTRPOINT(p, KTR_SYSCALL))
                ktrsyscall(p->p_tracep, code, callp->sy_narg, args);
#endif
        p->p_retval[0] = 0;
        p->p_retval[1] = frame.tf_edx;

        STOPEVENT(p, S_SCE, callp->sy_narg);

        error = (*callp->sy_call)(p, args);

        switch (error) {
...

This means that to audit this case, we would have to create, set up, and
commit an audit record inside the syscall handling code anyway.

> > We indeed both violently agree that argument auditing, etc, must happen
> > inside the syscall.  But my temptation is to push a little of the
> > general-purpose work out of the syscall and into the handler.
> 
> See above.  I think by pushing it outside of the syscall we end up
> obfuscating (a little bit) of what's going on.  Making these easy to
> understand is of prime importance to me, because the chances of it being
> accepted and maintained are much greater.  If kernel writers think they
> don't have to do anything do make new syscalls work correctly in the new
> IDS setup, they won't.

I agree that ease of understanding is important, but I feel that retaining
code simplicity for syscall implementors is also important.  We get a big
stepping stone towards having implemented a lot of syscall auditing just
by putting in record creation and commital outside of the syscall, as we
already have return codes audited, as well as credentials.

I hope your vacation is going well, leaving aside my pestering :-).  My
plan is to not bring a notebook with my on my vacation to Maine in
mid-August for this very reason.

  Robert N M Watson 

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Computing Laboratory at Cambridge University
Safeport Network Services



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-security" in the body of the message




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.BSF.3.96.990715112817.27279D-100000>