Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 8 Jul 2008 16:26:19 +0100 (BST)
From:      Robert Watson <rwatson@FreeBSD.org>
To:        Sergey Babkin <babkin@verizon.net>
Cc:        arch@FreeBSD.ORG, David Schultz <das@FreeBSD.ORG>, Poul-Henning Kamp <phk@phk.freebsd.dk>
Subject:   Re: Re: Re: Proposal: a revoke() system call
Message-ID:  <20080708161802.N89342@fledge.watson.org>
In-Reply-To: <20080708001929.E63144@fledge.watson.org>
References:  <9484951.340521215467447990.JavaMail.root@vms126.mailsrvcs.net> <20080708001929.E63144@fledge.watson.org>

next in thread | previous in thread | raw e-mail | index | archive | help

On Tue, 8 Jul 2008, Robert Watson wrote:

> Which can be accomplished by calling dup2(2) to replace the file descriptor 
> with another file descriptor, perhaps one to /dev/null.  It would be worth 
> carefully reviewing the implementation of dup2(2) to make sure that the 
> close->replace there is atomic with respect to other threads simultaneously 
> allocating file descriptors, such as with pipe(2).

BTW, on a similar note to the above: I've noticed there are several spots of 
relative non-atomicity in the Linux emulation code, where rather than just 
wrapping existing system calls with binary conversion of arguments and return 
values, we do a semantic wrapping that is necessarily non-atomic with respect 
to the native code.  For example, consider the Linuxulator open code in 
linux_common_open():

134     error = kern_openat(td, dirfd, path, UIO_SYSSPACE, bsd_flags, mode);
135
136     if (!error) {
137             fd = td->td_retval[0];
138             /*
139              * XXX In between kern_open() and fget(), another process
140              * having the same filedesc could use that fd without
141              * checking below.
142              */
143             error = fget(td, fd, &fp);
144             if (!error) {
145                     sx_slock(&proctree_lock);
146                     PROC_LOCK(p);
147                     if (!(bsd_flags & O_NOCTTY) &&
148                         SESS_LEADER(p) && !(p->p_flag & P_CONTROLT)) {
149                             PROC_UNLOCK(p);
150                             sx_unlock(&proctree_lock);
151                             if (fp->f_type == DTYPE_VNODE)
152                                     (void) fo_ioctl(fp, TIOCSCTTY, 
(caddr_t) 0,
153                                              td->td_ucred, td);
154                     } else {
155                             PROC_UNLOCK(p);
156                             sx_sunlock(&proctree_lock);
157                     }
158                     if (l_flags & LINUX_O_DIRECTORY) {
159                             if (fp->f_type != DTYPE_VNODE ||
160                                 fp->f_vnode->v_type != VDIR) {
161                                     error = ENOTDIR;
162                             }
163                     }
164                     fdrop(fp, td);
165                     /*
166                      * XXX as above, fdrop()/kern_close() pair is racy.
167                      */
168                     if (error)
169                             kern_close(td, fd);
170             }
171     }

I think that comment is mine, or at least, got there because of a comment I 
made to Roman or the like.  The fd has not yet been explicitly returned to 
userspace, since the open system call hasn't actually returned, but other 
threads could use the file descriptor in a system call that could lead to 
unexpected races.  For example, if you dup2() on top of the file descriptor 
between the return of kern_openat() and the invocation of fget(), fo_ioctl() 
might be called on the wrong file, or the kern_close() in the error case might 
get invoked on the "wrong" file descriptor.

In these cases, the races are mostly harmless since they involve incorrectly 
using a file descriptor from a second thread -- since it hasn't been returned 
yet, it isn't valid yet and the results will be undefined.  However, there may 
well be cases where similar races exist that do affect the semantics of 
multi-threaded Linux applications, such as having a main event (open()) and an 
associated event (fo_ioctl()) be non-atomic and allowing a race between them 
that does have a semantically problematic result.

These sorts of edge cases, btw, are one reason why I would *strongly* 
discourage application writers from doing things like calling close(2) on a 
file descriptor while still using it from another thread. :-)

Robert N M Watson
Computer Laboratory
University of Cambridge



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20080708161802.N89342>