Date: Tue, 12 Aug 2008 02:36:20 +1000 (EST) From: Bruce Evans <brde@optusnet.com.au> To: John Baldwin <jhb@FreeBSD.org> Cc: cvs-src@FreeBSD.org, src-committers@FreeBSD.org, Ed Schouten <ed@FreeBSD.org>, cvs-all@FreeBSD.org, Bruce Evans <brde@optusnet.com.au> Subject: Re: cvs commit: src/sys/dev/io iodev.c Message-ID: <20080812014937.E21092@besplex.bde.org> In-Reply-To: <200808091555.25020.jhb@freebsd.org> References: <200808081343.m78DhwYE068477@repoman.freebsd.org> <200808081226.32089.jhb@freebsd.org> <20080809130929.P77335@delplex.bde.org> <200808091555.25020.jhb@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sat, 9 Aug 2008, John Baldwin wrote: > On Saturday 09 August 2008 12:35:50 am Bruce Evans wrote: >> On Fri, 8 Aug 2008, John Baldwin wrote: >>> On Friday 08 August 2008 09:43:56 am Ed Schouten wrote: >>>> ed 2008-08-08 13:43:56 UTC >>>> >>>> Apart from this change, I think some fishy things may happen when >>>> using /dev/io in multithreaded applications. I haven't tested, but >>>> looking at the code, the flag doesn't get cleared when close() is called >>>> from another thread, but this may not be this important. >> >> Of course it isn't. The flag only gets cleared on last close. Threads >> probably involve mainly file descriptors, and for file descriptors close() >> doesn't even reach vn_close() until the fd reference count drops to 0. >> Then for files, vn_close() normally doesn't reach device close until the >> file reference count drops to 0. >> >>> It should be setting D_TRACKCLOSE though so that close() reliably clears >>> the flag even in single-threaded processes. You can still get odd >>> behavior if >> >> You mean "should _not_ be setting D_TRACKCLOSE", so that close() works >> normally. >> >> close() can never reliably clear the flag, since even vn_close() is not >> reached after open()/dup()/close() or open()/fork()/close()... >> >>> you explicitly open it twice in an app and then close one of the two >>> fd's. You will no longer have IO permission even though you still have >>> one fd open. However, if you do that I think you deserve what you asked >>> for. :) >> >> You asked for normal last-close behaviour and deserve that not being broken >> by setting D_TRACKCLOSE. >> >> D_TRACKCLOSE allows different handling of non-last closes, but this is >> rarely wanted, and in theory it allows better determination of last >> closes, since vfs doesn't count last closes right. However, it is >> difficult to count last closes right, and most uses of D_TRACKCLOSE handle >> last closes are worse than vfs: >> - ast: uses D_TRACKCLOSE to trash (write a filemark) and/or rewind tapes >> on non-last closes. After the trash and rewind, astclose() uses >> count_dev() to avoid doing a full hardware close if the device is still >> open. This involves using count_dev(), but count_dev() is just an >> interface to the vfs count, so it miscounts in the same way as vfs. In >> particular, it doesn't count incompleted opens. I don't know if ast's >> locking is sufficient to prevent close() being called while a new open >> is in progress. For most devices, it isn't. >> - drm: it uses its own count of opens and closes, and it doesn't use >> count_dev(). The own count has some chance of working, but needs >> delicate locking of device open and device close to prevent them >> running into each other, and some defense against the vfs bugs which >> at least used to result in missing and/or extra closes. >> - smb: like drm, except it limits its own count to 0 or 1 to give half- >> baked exclusive access (fork() and dup(), etc, still give non-exclusive >> access). D_TRACKCLOSE thus has no effect except for its interaction >> with other bugs. >> - geom: unlike most or all of the others, this may actually need >> D_TRACKCLOSE, to implement multiple logical drivers per physical device. >> I don't understand its details. >> - apm: like drm >> - bpf: like smb > > You failed to note that not using D_TRACKCLOSE doesn't actually reliable close > devices. No, I noted that vfs doesn't count last closes right. Unreliability of last-close is a special case of that. But using D_TRACKCLOSE mostly makes things worse. To avoid the vfs bugs in tracked closes, it is necessary for individual drivers to understand vfs's bugs better than vfs does. > I set D_TRACKCLOSE on bpf because under load at work (lots of > concurrent nmap's) bpf devices would sometimes never get fully closed, so > you'd have an unopened bpf device that was no longer available. It is unclear how that helps. vfs's bugs must be really large for the last-close to go completely missing. kib fixed some of them a few months ago (before you changed bpf I think). bpf is especially simple since it is exclusive-accesses, so the vfs bugs should be smaller for it. > D_TRACKCLOSE > gives much saner semantics for devices, each open() of an fd calls > the 'd_open' method, and the last close of that fd (which could be in another > process) calls 'd_close'. This is insane semantics for most devices, and D_TRACKCLOSE doesn't give it. Most devices don't care about the difference between file descriptors and files -- they want d_close to be called only when the last file descriptor on the device is closed. !D_TRACKCLOSE gives this, modulo the vfs bugs. D_TRACKCLOSE doesn't give these semantics: revoke() closes everything in 1 step and then calls d_close(), giving an N-1 correspondence between d_open()s and d_close()s (modulo the vfs bugs giving a fuzzier correspondence). (IIRC, at least before kib's fixes, it was possible for revoke() to cause any of 0, 1 or 2 calls to d_close(). I think 1 call always gets as far as devfs_close() but this is only guaranteed to reaach d_close() with D_TRACKCLOSE.) This is not a bug, but all drivers that depend on the correspondence being 1-1 are broken. Drivers that enforce "exclusive" access like bpf have N=1, so they probably work modulo the vfs bugs, and they have a good chance of not being affected by the vfs bug that gives an extra devfs_close() and thus an extra d_close(). > /dev/io in particular is quite bogus since even though a child process > inherits the file descriptor, it doesn't inherit the behavior of > having /dev/io open. Not just child processes. The same process disinherits the behaviour on exec though it normally keeps the fd open. Bruce
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20080812014937.E21092>