Date:        Thu, 27 Jul 2000 16:13:08 -0400 (EDT)
From:        Robert Watson <rwatson@freebsd.org>
To:          Isaac Waldron <waldroni@lr.net>
Cc:          freebsd-hackers@freebsd.org, freebsd-arch@freebsd.org
Subject:     Re: Writing device drivers (ioctl issue)
Message-ID:  <Pine.NEB.3.96L.1000727155220.96611E-100000@fledge.watson.org>
In-Reply-To: <005301bff73b$bf8a3460$0100000a@waldron.house>
On Wed, 26 Jul 2000, Isaac Waldron wrote:

> I started working on a port of FreeMWare/plex86 (www.plex86.org) to
> FreeBSD yesterday, and have run into a small problem.  The basic idea
> is that I need to write a kernel module that implements some ioctls
> for a new pseudo-device that will eventually reside at /dev/plex86.
>
> The issue I'm running into is with the function I'm writing to handle
> the ioctls for the device.  For one of the ioctls, the code needs to
> get some data from the file descriptor that was passed to the original
> call to ioctl(2).  This is easily accomplished in Linux, because the
> file descriptor is passed as the second argument to the device_ioctl
> function.
>
> Is there an easy way to get at the same data (the file descriptor
> passed to ioctl(2) by the calling program, in a kernel-style "struct
> file *", not the standard "struct FILE *") in FreeBSD?  Or will it be
> necessary to change the ioctl structure slightly and therefore need to
> change some of the higher-level functions in plex?

I ran into this same problem when modifying the vmmon VMWare driver for
FreeBSD to support multiple emulator instances.

FreeBSD's VFS has no concept of stateful file access: there are opens and
closes, but the VOP_READ/WRITE operations are not associated with
sessions.  This influences the way in which drivers are implemented for
BSD (and platforms like it).  For example, rather than having one
/dev/bpf with multiple "open" instances, we have /dev/bpf{0,1,...}, and a
process needing a session sequentially attempts to open devices until it
finds one that doesn't return EBUSY (a userland sketch of this probing
loop follows below).  The driver, in this case, limits the number of open
references to 1.

There are a number of possible solutions to this problem, including the
Linux solution of passing the file descriptor down the VFS stack so that
VFS layers can attach information to the file descriptor providing
session information.  In this manner, VOP_READ/WRITE can determine which
session is active and behave appropriately.  I dislike this solution:
right now, file descriptors are a property of the process and ABI, and
the VFS is unaware of them.  Having a stacked file system also suggests
that the single hook in the file descriptor is insufficient to maintain
per-layer information associated with a session.  It also makes a mess of
access to files from within the kernel, where file descriptors are not
used.

My preferred solution, and I actually hacked around with a kernel a bit
to do this, is to make the VFS provide (optional) stateful vnode
sessions.  vop_open() would gain an additional call-by-reference
argument, probably a void**.  When NULL, the caller would be requesting a
stateless vnode open, and all would be as today.  When non-NULL, this
would allow the vnode provider to return a cookie/rock/void pointer to
state information for the session.  Other VOPs would similarly accept
back this cookie, allowing the VOP provider to inspect it (if non-NULL)
and behave appropriately with state.  vop_close() could be used to
release the cookie.  This would provide the ability for file systems and
callers to optionally make use of state, without violating the separation
of file descriptors/open file records and the VFS.  It would also allow
stacking to occur, as each vnode private data layer/layered cookie struct
could do appropriate layer transformations to get the right cookie for
the next layer down.  I.e., there would be a sensible semantic for
stacked file systems to provide stateful access.  (A rough sketch of the
proposed hook also follows below.)
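First, the probing workaround mentioned earlier: a minimal userland
sketch of the loop a process might use today to grab a free bpf unit.
This is illustrative only; the unit limit and error handling are
simplified.

/*
 * Probe /dev/bpf0, /dev/bpf1, ... until one opens; a unit that is
 * already in use returns EBUSY, and a unit that doesn't exist ends
 * the search.
 */
#include <sys/types.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

int
open_bpf(void)
{
	char path[16];
	int fd, i;

	for (i = 0; i < 256; i++) {
		snprintf(path, sizeof(path), "/dev/bpf%d", i);
		fd = open(path, O_RDWR);
		if (fd >= 0)
			return (fd);	/* got an unused instance */
		if (errno != EBUSY)
			break;		/* no more nodes, or a real error */
	}
	return (-1);
}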
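And to make the shape of the proposed interface concrete, here is a
rough sketch of what a pseudo-device's open/close might look like under
such a scheme.  None of this is committed code: the function names, the
flat argument list, and the session structure are all hypothetical.

/*
 * A caller that wants stateful access passes a non-NULL void ** at
 * open time; the provider hands back a cookie which later VOPs
 * receive, so the provider can find its per-session state.  Passing
 * NULL keeps today's stateless behaviour.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>
#include <sys/proc.h>
#include <sys/vnode.h>

struct pseudo_session {
	int	ps_unit;		/* per-session emulator state, etc. */
};

static int
pseudo_vop_open(struct vnode *vp, int mode, struct ucred *cred,
    struct proc *p, void **cookiep)
{
	struct pseudo_session *sp;

	if (cookiep == NULL)
		return (0);		/* stateless open, as today */

	sp = malloc(sizeof(*sp), M_TEMP, M_WAITOK);
	bzero(sp, sizeof(*sp));
	*cookiep = sp;			/* handed back on later VOPs */
	return (0);
}

static int
pseudo_vop_close(struct vnode *vp, int fflag, struct ucred *cred,
    struct proc *p, void *cookie)
{
	if (cookie != NULL)
		free(cookie, M_TEMP);	/* session ends with the close */
	return (0);
}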
My changes are incomplete, as I was working on this on the plane, and
comments on the idea would be welcome.  One thing this would allow is for
us not to have to heavily replicate device nodes in /dev for
multi-instance virtual devices.  The BPF example is a useful one here:
while the kernel currently supports dynamically allocated BPF devices,
/dev has to have BPF entries manually added.  The same goes for tunnel
devices, et al.  While a real devfs would fix this, the semantic is also
useful for drivers ported from Linux (and other platforms with stateful
vnode access) that expect to be able to open /dev/vmmon and get a new,
unique session.  For /dev/vmnet, it means the driver can detect multiple
sessions on the same device and act appropriately.  In vmnet, each vmnet
device acts like an ethernet bridge for the sessions open on it, so you
can bind different VMWare sessions to different virtual network segments,
potentially more than one VMWare session per network segment.

Or, you can binary-modify VMWare each time you run it to open a different
/dev/{whatever}, or get the developer to use a /dev/whatever{0,1,2,3...}
model, for which there is much precedent in Linux (BPF, ttys, etc).

Robert N M Watson

robert@fledge.watson.org              http://www.watson.org/~robert/
PGP key fingerprint: AF B5 5F FF A6 4A 79 37  ED 5F 55 E9 58 04 6A B1
TIS Labs at Network Associates, Safeport Network Services