Date: Sun, 12 May 2013 14:01:58 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: John Baldwin <jhb@freebsd.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, Bruce Evans <brde@optusnet.com.au> Subject: Re: svn commit: r250220 - head/sys/kern Message-ID: <20130512140150.E646@etaplex.bde.org> In-Reply-To: <201305101142.32086.jhb@freebsd.org> References: <201305031908.r43J8xnI094418@svn.freebsd.org> <201305061355.20826.jhb@freebsd.org> <20130510123842.B637@etaplex.bde.org> <201305101142.32086.jhb@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Fri, 10 May 2013, John Baldwin wrote: > On Thursday, May 09, 2013 11:05:46 pm Bruce Evans wrote: >> On Mon, 6 May 2013, John Baldwin wrote: >> ... >>> static int >>> vn_ioctl(fp, com, data, active_cred, td) >>> struct file *fp; >>> u_long com; >>> void *data; >>> struct ucred *active_cred; >>> struct thread *td; >>> { >>> struct vnode *vp = fp->f_vnode; >>> struct vattr vattr; >>> >>> switch (vp->v_type) { >>> case VREG: >>> case VDIR: >>> switch (com) { >>> case FIONREAD: >>> vn_lock(vp, LK_SHARED | LK_RETRY); >>> error = VOP_GETATTR(vp, &vattr, active_cred); >>> VOP_UNLOCK(vp, 0); >>> if (!error) >>> *(int *)data = vattr.va_size - fp->f_offset; >>> return (error); >>> case FIONBIO: >>> case FIOASYNC: >>> return (0); /* XXX */ >>> default: >>> return (VOP_IOCTL(vp, com, data, fp->f_flag, >>> active_cred, td)); >>> } >>> default: >>> return (ENOTTY); >>> } >>> } > ... >>> (The 'XXX' comment could perhaps be expanded to something along the lines of >>> 'Allow fcntl() to toggle FNONBLOCK and FASYNC.') >> >> Is that what it is about? IIRC, upper layers do some partial handling >> and then call lower layers to possibly do some more handling. But here >> and in some other places there is nothing more to be done. No comment >> is needed, but maybe the XXX's were reminders to clean up the layering. >> FreeBSD has cleaned up the layering a bit, by using differnt fops for >> different file types. > > The problem is this code in fcntl() which I ran into when working on a > tutorial for writing character devices: > > case F_SETFL: > error = fget_unlocked(fdp, fd, CAP_FCNTL, F_SETFL, &fp, NULL); > if (error != 0) > break; > do { > tmp = flg = fp->f_flag; > tmp &= ~FCNTLFLAGS; > tmp |= FFLAGS(arg & ~O_ACCMODE) & FCNTLFLAGS; > } while(atomic_cmpset_int(&fp->f_flag, flg, tmp) == 0); > tmp = fp->f_flag & FNONBLOCK; > error = fo_ioctl(fp, FIONBIO, &tmp, td->td_ucred, td); > if (error != 0) { > fdrop(fp, td); > break; > } > tmp = fp->f_flag & FASYNC; > error = fo_ioctl(fp, FIOASYNC, &tmp, td->td_ucred, td); > if (error == 0) { > fdrop(fp, td); > break; > } > atomic_clear_int(&fp->f_flag, FNONBLOCK); > tmp = 0; > (void)fo_ioctl(fp, FIONBIO, &tmp, td->td_ucred, td); > fdrop(fp, td); > break; > > Hmm, this seems to have the bug that if you had FNONBLOCK set and > tried to set FASYNC via fcntl() but FIOASYNC isn't supported, > FNONBLOCK is cleared. It seems we should only clear FNONBLOCK > if it wasn't set in 'flg'. I think this would fix that: sys_socket.c mishandles this differently by voiding the result of all fo_ioctls(). The above only assumes that since the first fo_ioctl() for FIONBIO succeeded, the second one will too. I now remember more about this bad code: - it is still very broken for devices. ioctls are inherently per-device, but file flags aren't even per-file -- they are per-file-descriptor -- so the device state becomes inconsistent with the file descriptor state if more than 1 fd is open on a device and any non-null change of the file flags is made using one of the fd's (applications might be able to keep the file flags fairly consistent by doing fcntl for all the fd's (including ones shared across processes), but the file flags would be at least transiently inconsistent. Only the FNONBLOCK (O_NONBLOCK) and FASYNC (O_ASYNC) flags are passed down. O_NONBLOCK works right for some devices because it is also passed to open() and read()/write() (so it will work right for ioctl() if the device driver ignores it then). There are related not so bad cases for F_GETFL, F_GETOWN and F_SETOWN. F_GETFL returns per-fd flags. There are no per-fd ownerships, so F_GETOWN has to use an ioctl to fetch a non-per-fd ownership (the one set by F_SETOWN) which is at least consistently not per-fd. Most drivers/file types are actually non-broken for FIONBIO and do nothing except return 0 for it (I only grepped for FIONBIO; not for O_NONBLOCK or FNONBLOCK). This return is sometimes XXX'ed, but it really shouldn't be. Returning 0 and not doing anything means that the driver supports O_NONBLOCK correctly (by checking it per-fd in the flag passed to open() and read/write()). It is the drivers that do something which deserve an XXX. The only broken ioctl routines for FIONBIO are now: - subr_bus.c: devioctl() - sys_socket.c: soo_ioctl() - pcm/dsp.c: dsp_ioctl() - audit/audit_pipe.c: audit_pipe_ioctl() - cbus/fdc.c: fdioctl(). This is different from fdc/fdc.c. The latter just returns 0 and claims that this is for backwards compatibility with old utilities. Apparently the pc98 version hasn't been moved to userland as much as the pc98 version. But the comment is wrong, since old utilities are just broken since they depend on kernel features that were removed. IIRC, not much more than soo_ioctl() had this bug in 4.4BSD, except F_SETOWN as more complicated and broken (but not supported for as many devices/file types). truckman@ did a lot of work to fix F_SETOWN and assoicated signal handling. > Index: kern_descrip.c > =================================================================== > --- kern_descrip.c (revision 250451) > +++ kern_descrip.c (working copy) > @@ -539,7 +539,7 @@ kern_fcntl(struct thread *td, int fd, int cmd, int > } > tmp = fp->f_flag & FASYNC; > error = fo_ioctl(fp, FIOASYNC, &tmp, td->td_ucred, td); > - if (error == 0) { > + if (error == 0 || (flg & FNONBLOCK)) { > fdrop(fp, td); > break; > } > > This would mean you would no longer have to support the FIOASYNC > ioctl just to allow FIONBLOCK to be set. Looks good, but... > Note, btw, that kern_ioctl() doesn't enforce quite the same > requirement as F_SETFL in that FIONBIO doesn't check for FASYNC and vice > versa. ioctl() also doesn't revert the change if the backing fo_ioctl > method fails. ... it's bogus to back out of just the FIONBIO setting. Other flags changes (to O_APPEND and O_DIRECT) are not backed out of. Apparently the idea in fcntl was to back out of all changes. This may have even worked before O_APPEND was supported. O_DIRECT didn't exist until later. ioctl() is too complicated to have ever attempted this. This causes problems for applications programming: I don't know what the spec for fcntl() or ioctl() says, but for tcsetattr() (which is implemented using ioctl() in FreeBSD), POSIX says that the syscall shall (?) succeed if at least one attribute was "set" (changed?). So when you ask tcsetattr() to change lots of attributes, the only correct way to determine if it succeeded is to call tcgetattr() and check that all the changes that you care about actually occurred). > I actually think that fcntl should probably ignore ENOTTY errors > if the flag is not set. Otherwise using fcntl to toggle some other > flag can succeed but still return an error. I think ENOTTY errors for FIONBIO are rare, since too many places copy the code that returns 0 for FIONBIO even when O_NONBLOCK is not supported. E.g., the non-pc98 fdc driver doesn't support O_NONBLOCK, but it has a compatibiity excuse for claiming support. I think fo_ioctl() shouldn't be called for null changes (especially from the unset case to the unset case). Calling it should fail quite often and thus give ENOTTY even for callers that have no interest in FNONBLOCK or FASYNC but want to toggle some other changes. However, always calling it minimises the inconsistencies for drivers(...) that act on it -- this syncs the driver state with the fd state from the last fcntl() on the set of fd's referencing the device, irrespective of whether the current fcntl() is toggling the state. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130512140150.E646>