Date: Mon, 08 Oct 2001 09:30:59 +0100
From: Brian Somers <brian@freebsd-services.com>
To: Poul-Henning Kamp <phk@critter.freebsd.dk>
Cc: Peter Wemm <peter@wemm.org>, Brian Somers <brian@Awfulhak.org>, freebsd-arch@FreeBSD.ORG, brian@freebsd-services.com
Subject: Re: Cloned open support
Message-ID: <200110080830.f988UxT40831@hak.lan.Awfulhak.org>
In-Reply-To: Message from Poul-Henning Kamp <phk@critter.freebsd.dk> of "Mon, 08 Oct 2001 07:43:11 +0200." <95926.1002519791@critter.freebsd.dk>

Hi,
First, let me mention that I think your response is a bit odd....
timing-wise. I posted the message that you've responded to on
January 29 and you responded on October 7. I implemented cloning on
if_tun in May... :*P
Maybe my mail server is underperforming !
> In message <20011007221706.1ABAE3809@overcee.netplex.com.au>, Peter Wemm writes:
> >Poul-Henning Kamp wrote:
> >>
> >> Uhm, Brian...
> >>
> >> We have cloned devices already...
> >>
> >> What exactly is it that you want to implement ?
> >>
> >> Poul-Henning
> >
> >Devfs name cloning feels hackish to me. Having a separate EVENTHANDLER()
> >for doing it feels .. just nasty. I'd much rather that we had a d_clone
> >devsw entry and/or a D_CLONE d_flags entry.
>
> Call it hackish, I call it elegant:
>
> * I didn't have to modify all the device drivers making them
> incompatible with anything anybody ever learned.
>
> * I didn't have to do long-haired vnode operations in cloning
> drivers, thus preserving the ability to do systematic SMP
> lock-boundaries at the cdevsw-> level.
>
> * It supports parameterized clone opens (ie: not just "/dev/pty",
> but also "/dev/ad0s1g" and even if somebody implemented it:
> "/dev/ccd,mirror,ad0s1f,ad1s1f" :-)
>
> * It is a *LOT* simpler than doing it by vnodes...
>
> The story is that by the time you reach devsw->open() you have
> committed the vnode and if you change the device at that time you
> need to unwind all the way back up to the association of the vnode
> with the dev_t, and wind all the way back down before the open can
> progress.
>
> (The fact that it is an EVENTHANDLER is just a matter of implementation,
> I didn't see a point to reimplementing the same functionality when
> EVENTHANDLERS already were available).
>
> >There are two types of cloning. One is to map some name "/dev/fd0135ds18h"
> >into a device node without having to flood /dev with all possible
> >permutations. The other is to support per-device "select next unit" style
> >opens. Presently these are both kludged into the EVENTHANDLER interface.
>
> Those two are actually the same kind of open Peter, semantically
> they both say "make me a device according to this wish: ``...'' and let
> me open it".
>
> >I think Brian wants to move the second part directly into the open handler
> >like it is done on most other OS's that support cloning. Personally,
> >I would be quite happy if we could do that.
>
> Most other OS's have made a mess of their vnodes and drivers by
> doing so :-(
>
> The FreeBSD implementation completely sidesteps all the vnode hair
> by doing the cloning at namei() time instead of open time, this
> makes it much simpler and much more capable.
>
> If you do a vnode based cloning, it will not support your
> "/dev/fd0135ds18h" example above, unless you flood /dev with all
> possible entries.
>
> >I realize why it is done the way it is done now though. VOP_LOOKUP()
> >having to return a unique vnode for the device is a pain. (which is why
> >the clone is done during lookup, so that the correct vnode is found and
> >available). But understanding why doesn't mean that I don't wish that it
> >could be different. :-)
>
> Well, if devices lived at the file descriptor level instead of at the vnode
> level, things would be different (but I haven't tried to implement that
> so I can't say for sure if it would actually be "better"...)
>
> >Doug mentions the hack in dev/streams/streams.c:
> > td->td_dupfd = fd;
> > return ENXIO;
> >.. this is nasty. :-)
>
> This is abuse, it should be rewritten.
>
> >I think the SVR4 clone driver uses something like this. It causes the
> >original namei / open attempt to fail (thus releasing the "common" vnode)
> >and then switching over to the *real* file/vnode at the last minute.
>
> We would have to do that as well in order to unwind the committed vnode
> and select another.
>
>
> I would like to request that nobody starts to commit a vnode based cloning
> (or API changes for it) until they actually have a working prototype.
> I've been there, done that and threw it away.
>
> The only reason I can see for adding vnode-based cloning would be if
> somebody can point out something they cannot do with namei-based
> cloning...
>
> <SHAMELESS PLUG>
> My BSDCONey and BSDCON talks would be very good places to ask questions
> about this :-)
> </SHAMELESS PLUG>
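
To make the namei-time idea in the quoted text concrete, here is a
minimal userspace sketch of name-based cloning. It is not the devfs
code, only an illustration under my own assumptions: the real hook is
the EVENTHANDLER discussed above, and every name below (struct node,
tun_units, tun_clone, MAX_UNITS) is made up. A lookup that misses hands
the name to the handler, which either claims it ("tun" for the next
free unit, "tunN" for a specific unit) or declines.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_UNITS 8

/* Hypothetical stand-in for a device node; not the kernel's dev_t. */
struct node {
	char name[16];
	int in_use;
};

struct node tun_units[MAX_UNITS];

/*
 * Called when a lookup in /dev misses.  Returns the node to open, or
 * NULL if the name is not ours (or no unit can satisfy it).
 */
struct node *
tun_clone(const char *name)
{
	int unit;

	assert(name != NULL);
	if (strncmp(name, "tun", 3) != 0)
		return (NULL);
	if (name[3] == '\0') {
		/* "tun": pick the lowest unit not yet created. */
		for (unit = 0; unit < MAX_UNITS; unit++)
			if (!tun_units[unit].in_use)
				break;
		if (unit == MAX_UNITS)
			return (NULL);
	} else {
		/* "tunN": the wish carries its own parameter. */
		char *end;
		long n;

		if (!isdigit((unsigned char)name[3]))
			return (NULL);
		n = strtol(name + 3, &end, 10);
		if (*end != '\0' || n >= MAX_UNITS)
			return (NULL);
		unit = (int)n;
	}
	snprintf(tun_units[unit].name, sizeof(tun_units[unit].name),
	    "tun%d", unit);
	tun_units[unit].in_use = 1;
	return (&tun_units[unit]);
}
```

Because the node exists before any open happens, both "give me the
next unit" and "give me this exact unit" go through the same path,
which is the point made in the quoted text about the two kinds of
cloning being the same.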

My feeling on the whole topic is that we now have a very workable
system with two drawbacks:

o The ``clone device'' doesn't turn up in /dev. This means that an
  administrator cannot treat it as a filesystem object WRT
  permissions - in fact, he can't even see it on the filesystem.
  IMHO this causes namespace problems, but this is also quite
  fixable. I'd like to talk about this at BSDConEurope.

o The SI_CHEAPCLONE stuff is easy to get wrong, and getting it
  wrong opens up a bad DoS. Maybe the answer is that specinfos
  that are returned from make_dev() during clone() have the
  SI_CHEAPCLONE flag already set and a successful call to the
  driver's d_open() clears SI_CHEAPCLONE ?

  But this doesn't quite work with the tun device. The tun device
  abuses this flag so that it can use dev_depends() to blow away
  all of its make_dev()s at module unload time.... It doesn't
  want to destroy_dev() them at d_close() time because I'd prefer
  that the administrator is able to ``touch /dev/tunX'' then
  ``chmod /dev/tunX'' at boot time.

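
The SI_CHEAPCLONE suggestion in the second item can be sketched
roughly like this. It is a userspace model only, and all of the names
(sk_dev, SK_CHEAPCLONE, sk_make_dev_in_clone, sk_open, sk_reclaim) are
hypothetical stand-ins, not kernel interfaces. The assumption is that a
node still carrying the flag may be thrown away cheaply, so a node that
nobody ever opened successfully cannot pile up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SK_CHEAPCLONE 0x1	/* hypothetical stand-in for SI_CHEAPCLONE */

struct sk_dev {
	unsigned flags;
	int exists;
};

/* Creation during a clone lookup: the flag is set by default. */
void
sk_make_dev_in_clone(struct sk_dev *d)
{
	d->exists = 1;
	d->flags = SK_CHEAPCLONE;
}

/* Open path: only a successful driver open clears the flag. */
int
sk_open(struct sk_dev *d, int (*d_open)(struct sk_dev *))
{
	int error = d_open(d);

	if (error == 0)
		d->flags &= ~SK_CHEAPCLONE;
	return (error);
}

/* Reclaim: drop the node if it is still cheap.  Returns 1 if dropped. */
int
sk_reclaim(struct sk_dev *d)
{
	if (d->exists && (d->flags & SK_CHEAPCLONE)) {
		d->exists = 0;
		return (1);
	}
	return (0);
}

/* Two toy driver open routines for the example. */
int sk_open_ok(struct sk_dev *d) { (void)d; return (0); }
int sk_open_fail(struct sk_dev *d) { (void)d; return (ENXIO); }
```

With this default the driver no longer has to remember to set the flag
itself; the cost, as noted above for tun, is that a driver wanting its
never-opened nodes to persist needs some other way to say so.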
A partially unrelated problem is that of tracking open devices from
inside a driver. I'm only mentioning this because the SI_CHEAPCLONE
flag makes this more difficult - it allows devfs to destroy_dev()
things when the driver isn't looking.... I don't think my SI_CHEAPCLONE
abuse in if_tun is correct. Maybe the right answer is to have devfs
notify the driver when it destroy_dev()s something ?
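
A rough sketch of what such a notification might look like, again as a
userspace model with invented names (nd_dev, on_destroy, nd_attach,
nd_destroy, nd_lookup, nd_count); nothing here is an existing devfs
interface. The driver keeps its own table of nodes and registers a
callback, and the filesystem side invokes that callback before the
node goes away.

```c
#include <assert.h>
#include <stddef.h>

#define ND_MAX 4

struct nd_dev;
typedef void nd_destroy_cb(struct nd_dev *);

struct nd_dev {
	int unit;
	nd_destroy_cb *on_destroy;	/* hypothetical notification hook */
};

/* The driver's own view of which nodes exist. */
struct nd_dev *nd_known[ND_MAX];

/* Driver callback: forget the node that is being destroyed. */
void
nd_forget(struct nd_dev *d)
{
	nd_known[d->unit] = NULL;
}

/* Driver side: record the node and ask to be told when it dies. */
void
nd_attach(struct nd_dev *d, int unit)
{
	assert(unit >= 0 && unit < ND_MAX);
	d->unit = unit;
	d->on_destroy = nd_forget;
	nd_known[unit] = d;
}

/* Filesystem side: tell the driver before the node disappears. */
void
nd_destroy(struct nd_dev *d)
{
	if (d->on_destroy != NULL)
		d->on_destroy(d);
}

struct nd_dev *
nd_lookup(int unit)
{
	return (unit >= 0 && unit < ND_MAX ? nd_known[unit] : NULL);
}

int
nd_count(void)
{
	int i, n = 0;

	for (i = 0; i < ND_MAX; i++)
		if (nd_known[i] != NULL)
			n++;
	return (n);
}
```

The driver's table then stays in step with /dev even when the node is
removed behind its back.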
> Poul-Henning
> --
> Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG | TCP/IP since RFC 956
> FreeBSD committer | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.
--
Brian <brian@freebsd-services.com> <brian@Awfulhak.org>
http://www.freebsd-services.com/ <brian@[uk.]FreeBSD.org>
Don't _EVER_ lose your sense of humour ! <brian@[uk.]OpenBSD.org>
