Date: Fri, 18 Jul 2014 01:55:38 +0200 From: Mateusz Guzik <mjguzik@gmail.com> To: freebsd-arch@freebsd.org Subject: current fd allocation idiom Message-ID: <20140717235538.GA15714@dft-labs.eu>
next in thread | raw e-mail | index | archive | help
The kernel has to deal with concurrent attempts to open, close, dup2 and use the same file descriptor. I start with stating a rather esoteric bug affecting this area, then I follow with a short overview of what is happening in general and a proposal how to change to get rid of the bug and get some additional enchancements. Interestingly enough turns out Linux is doing pretty much the same thing. ============================ THE BUG: /* * Initialize the file pointer with the specified properties. * * The ops are set with release semantics to be certain that the flags, type, * and data are visible when ops is. This is to prevent ops methods from being * called with bad data. */ void finit(struct file *fp, u_int flag, short type, void *data, struct fileops *ops) { fp->f_data = data; fp->f_flag = flag; fp->f_type = type; atomic_store_rel_ptr((volatile uintptr_t *)&fp->f_ops, (uintptr_t)ops); } This itself is fine, but it assumes all code obtaining fp from fdtable places a read memory barrier after reading f_ops and before reading anything else. As you could guess no code does that and I don't believe placing rmb's in several new places is the way to go. ============================ GENERAL OVERVIEW OF CURRENT STATE: fps are obtained and installed as follows: struct file *fp; int fd; if (error = falloc(&fp, &fd)) return (error); if (error = something(....)) { fdclose(fp, fd); fdrop(fp); return (error); } finit(fp, ....); fdrop(fp); return (0); After falloc fp is installed in fdtable, it has refcount 2 and ops set to 'badfileops'. if something() failed: fdclose() checks if it has anything to do. if so, it cleares fd and fdrops fp fdrop() clears the second reference, everything is cleared up if something() succeeded: finit() finishes initialization of fp fdrop() cleares the second reference. fp now has expected refcount of 1. Now a little complication: parallel close() execution: fd is recognizes as used. it is cleared and fdrop(fp) is called. if something() succeeded after close: fdrop() kills fp if something() failed after close: fdclose() concludes nothing to do fdrop() kill fp Same story with dup2. What readers need to do: - rmb() after reading fp_ops - check fp_ops for badfileops ============================ PROPOSAL: struct file *fp; int fd; if (error = falloc(&fp, &fd)) return (error); if (error = something(....)) { fdclose(fp, fd); return (error); } finit(fp, ....); factivate(fd, fp); return (0); After falloc fd is only marked as used, fp is NOT installed. fp is returned with refcount of 1 and is invisible to anyone but curthread. if something() failed: fdclose() marks fd as unused and kills fp if something() succeeded: finit() finishes initialization of fp factivate() sets fp to non-null with a barrier Now a little complication: parallel close() execution: since fp is null, fd is recognized as unused. EBADF is returned. The only problem is with dup2 and I believe it is actually a step forward. Let's assume fd was marked as used by falloc, but fp was not installed yet. dup2(n, fd) will see that fd is used. With current code there is no problem since there is fp to fdrop and it can proceed. With the proposal however, there is nothing to fdrop. Linux returns EBADF in this case which deals with the problem and does not seem to provide any drawbacks for behaving processes. So, differences to current approach: 1. fewer barriers and atomic operations 2. no need to check for f_ops type 3. new case when dup2 can return an error Note that 3 should not be a problem since Linux is doing this already. Also note current approach is not implemented correctly at the moment as it misses rmbs, although I'm unsure how much this matters in practice. Thoughts? -- Mateusz Guzik <mjguzik gmail.com>
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140717235538.GA15714>