From: Terry Lambert
Date: Thu, 14 Feb 2002 16:15:10 -0800
To: Carlos Ugarte
Cc: "Gary W. Swearingen", freebsd-chat@FreeBSD.ORG
Subject: Re: How do basic OS principles continue to improve?

Carlos Ugarte wrote:
[ ... compiler stuff ... ]
[ ... "Synthesis" ... ]
[ ... ]

Heh.  My first thought was the partial linking of shared objects in
the University of Utah compiler work, which cached prelinked shared
objects for reuse; among other things, this makes the relocation
table, which is normally reinstanced for each program linked with a
shared library, a sharable resource.

The current work has grown into "Khazana", which is (mostly)
unrelated:

	http://www.cs.utah.edu/khazana/

The original stuff was also early 1990's.  8-).

> More generally, my impression is the same as that posted by Terry.
> Most cutting edge research is done by small groups in experimental
> environments.  It takes a while for their work to propagate to more
> popular systems.
> For example, I believe the KSE work is based in part on the work
> done at the University of Washington in the early 90s ("Scheduler
> Activations").

The current implementation is, yes.  The original idea was to do some
even more basic work than that, but it devolved to an activations
implementation when it came time to cut code.

The original approach, async call gates, would have required losing
most compatibility with traditional UNIX systems, with the POSIX
interface becoming a library on top of the async interface; it would
not have been that big a deal, though: it would have been possible to
implement using an additional system call parameter, whose value was
NULL for sync calls.

Ironically, the idea comes out of some DECWRL work that found its way
into DEC OSs of the late 70s/early 80s.

> Another example, found in an article posted today on cnn.com -
> Microsoft's Farsite system (can't tell if it's expected in 2006 or
> in ten years) will make use of "experimental operating system
> technology called Byzantine fault-tolerant protocols".  Though work
> on such protocols continues even today, Byzantine faults were first
> identified some 20 years ago.

Yep.  You should have been at Novell back in 1994, when Drew Major
"invented" virtual memory in NetWare.  8-) 8-).

The cycle time from idea to distribution in code is very long.  The
LRP stuff I did at ClickArray was fairly cutting edge: the Rice
University implementation was fragile, and did not include a lot of
the work necessary to be able to use the code in a commercial
project.  Their newer stuff was a bit more advanced, but had
constrained distribution because of the terrible license.  Both sets
of code defined a new protocol family under which an alternate TCP
and IP stack executed, so that you could run the old and new code in
parallel, but that made the code doubly useless for real work.

The bottom line is that it usually takes a *long* time for research
to make it into deployment.
It's amusing how much the "Khazana" idea is finding footing (it's
intended as "An Infrastructure for Building Distributed Services",
which falls right in with the Microsoft Byzantine work).  Actually,
since Microsoft just "acquired" Ajaya Chitturi from the University of
Utah following graduation, that's not incredibly surprising.  8-).

> On a different note, there seems to be less emphasis on building new
> research systems from scratch; it is more and more common to see the
> experimental environments I mentioned above make use of systems such
> as FreeBSD and Linux (NetBSD, OpenBSD and the various Microsoft
> products aren't as prominent).

It's true that "free" OSs are being used more often as research
platforms; PSC and Rice tend to use FreeBSD and NetBSD a lot, and
other useful work tends to be BSD based these days (IPv6 out of KAME
via a relationship through the University of Tokyo, etc.).

There's a big danger, though, in using oddly licensed code.  For
example, research done in the Linux kernel will find itself under the
GPL, and will be unlikely to be incorporated into the next generation
of Cisco routers, at least without a rewrite from scratch.  Such
rewrites have historically turned out to be non-interoperable (cf.
the revisions that the Linux TCP/IP stack has had to fight its way
through to reach full interoperability).

> In these cases the "propagation lag" can be cut substantially, if
> the project leads are aware of the research and deem it worthy of
> being merged into the official tree.

Yes and no.  It takes incredible enlightenment to cut the *right*
"propagation lag", rather than the *wrong* one.
For example, the Pittsburgh Supercomputing Center out of Carnegie
Mellon University has a *vastly* superior SACK/FACK implementation,
and has another, related innovation called "Rate Halving":

	http://www.psc.edu/networking/rate_halving.html

Yet FreeBSD integrated code derived from other sources which was
clearly inferior, just to "keep up with the Joneses".

In another example, we have the Ganger/Patt work on Soft Updates;
"Appendix A" of their paper was a System V implementation with the
System V derived code stripped out.  At Artisoft, back in 1995, we
(mostly Matt Day) ported it to Windows, as part of a project to move
the Heidemann stacking FS framework to Windows (initially, we did the
FFS/UFS port to get around the FAT limitations, and then were FUD'ed
out of business by Microsoft by the promise of VFAT32, which was not
delivered for another two years, while we could have had NTFS and NFS
implementations for Windows within the same framework).

The Soft Updates code in the Ganger/Patt paper was ground breaking
work: it leapfrogged the patented DOW (Delayed Ordered Writes) USL
technology of 1993/1994.  The BSD implementation, however, is
effectively just a rigorous port/rewrite of the Appendix A code.  It
fails to take into account the propagation of dependency edges
between stacking layers, so it's practically useless in a stacking
environment: it can't, for example, export a transactioning interface
to user space, and you can't add additional edge/resolver pairs in
order to support non-FFS/UFS file systems, or to support e.g.
dependency resolution between underlying FFS/UFS and a stacked
cryptographic layer on top.  Like the Rice University LRP code, it's
more of a laboratory model than a product.

The "snapshots" code counts, in my book, as much more of an original
work...
but we see its origins in the WAFL FS from Network Appliance, which
has had the technology for a very long time now, to facilitate
backups (in fact, snapshots were one of the original outgrowths of
LFS technology, since they are almost naturally emergent as a
result).

So, frankly, I don't see the lag to implementation in usable systems
decreasing in the Open Source projects, any more than it is
decreasing in the commercial ones.

> If you're interested in seeing what kind of stuff is considered
> "cutting edge research" you might look for the proceedings of
> various conferences and workshops.  SOSP, OSDI, USENIX (Technical)
> and HotOS would be the ones I'd look at, though there are many
> others.

Or the Proceedings of the IEEE, or NEC CiteSeer:

	http://citeseer.nj.nec.com/cs

Or any of dozens of CS department research sites.  The FS stacking in
FreeBSD, for example, came out of work in UCLA's FICUS project, in
1991/1992.

I'm often incredibly frustrated by the lack of familiarity with the
literature in fields of endeavor, where people end up taking years to
reinvent a wheel, and then expect to be lauded for it.  This is
nowhere more apparent than in the Open Source SMP work to date, where
some of the approaches used were discarded by researchers as early as
1993 (e.g. the use of interrupt "threads" for processing work on the
top end as a result of a bottom end interrupt, which results in cache
busting and increased overall latency; other examples abound).

I think it would take a concerted effort for a project to actually
*be* cutting edge, as opposed to merely *looking* cutting edge, when
compared to commercial counterparts, like Microsoft.  Commercial
products are most often *far* from cutting edge, since disruptive
technologies tend to destabilize natural monopolies (or unnatural
ones ;^)).

Actually, there's a very good book on this subject:

	The Innovator's Dilemma
	Clayton M.
Christensen
	HarperBusiness
	ISBN: 0-06-662069-4

I highly recommend it for any technologist who finds themselves
frustrated by business processes that appear to be there to stifle
their work; in fact, many *are* there for *precisely* that reason,
and the businesses with them are successful *because* of this, not
*in spite of* it.

At the very least it's a fascinating read, and at best, it will show
you how to portray innovation as enlightened self interest, and give
you a number of techniques to use in pursuit of it (but be warned:
many of them involve getting spun off from your cushy company).

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-chat" in the body of the message