Date: Tue, 04 Jan 2011 14:44:31 -0800
From: Julian Elischer <julian@freebsd.org>
To: Jeff Roberson <jroberson@jroberson.net>
Cc: arch@freebsd.org
Subject: Re: Linux kernel compatability
Message-ID: <4D23A2CF.1010904@freebsd.org>
In-Reply-To: <alpine.BSF.2.00.1101041030120.1450@desktop>
References: <alpine.BSF.2.00.1101031017110.1450@desktop> <20110103220153.69cf59e0@kan.dnsalias.net> <alpine.BSF.2.00.1101031859290.1450@desktop> <20110104082252.45bb5e7f@kan.dnsalias.net> <alpine.BSF.2.00.1101041030120.1450@desktop>
On 1/4/11 12:53 PM, Jeff Roberson wrote:
> On Tue, 4 Jan 2011, Alexander Kabaev wrote:
>
>> On Mon, 3 Jan 2011 19:03:01 -1000 (HST)
>> Jeff Roberson <jroberson@jroberson.net> wrote:
>>
>>> On Mon, 3 Jan 2011, Alexander Kabaev wrote:
>>>
>>>> On Mon, 3 Jan 2011 10:31:24 -1000 (HST)
>>>> Jeff Roberson <jroberson@jroberson.net> wrote:
>>>>
>>>>> Hello Folks,
>>>>>
>>>>> Some of you may have seen my infiniband work proceed in svn. It is
>>>>> coming to a close soon and I will be integrating it into current.
>>>>> I have a few patches to the kernel to send for review but I wanted
>>>>> to bring up the KPI wrapper itself for discussion.
>>>>>
>>>>> The infiniband port has been done by creating a 10,000 line KPI
>>>>> compatibility layer. With this layer the vast majority of the
>>>>> driver code runs unmodified. The exceptions are anything that
>>>>> interfaces with skbs and most of the code that deals with network
>>>>> interfaces.
>>>>>
>>>>> Some examples of things supported by the wrapper:
>>>>>
>>>>> atomics, types, bitops, byte order conversion, character devices,
>>>>> pci devices, dma, non-device files, idr tables, interrupts,
>>>>> ioremap, hashes, kobjects, radix trees, lists, modules, notifier
>>>>> blocks, rbtrees, rwlock, rwsem, semaphore, schedule, spinlocks,
>>>>> kalloc, wait queues, workqueues, timers, etc.
>>>>>
>>>>> Obviously a complete wrapper is impossible and I only implemented
>>>>> the features that I needed. The build is accomplished by pointing
>>>>> the linux compatible code at sys/ofed/include/ which has a
>>>>> simulated linux kernel include tree. There are some config(8)
>>>>> changes to help this along as well.
>>>>>
>>>>> I have seen that some attempt at similar wrappers has been made
>>>>> elsewhere. I wonder if instead of making each one tailored to
>>>>> individual components, which mostly seem to be filesystems so far,
>>>>> should we put this in a central place under compat somewhere?
>>>>> Is this project doomed to be tied to a single consumer by the
>>>>> specific nature of it?
>>>>>
>>>>> Other comments or concerns?
>>>>>
>>>>> Thanks,
>>>>> Jeff
>>>>
>>>>
>>>> This probably will go against popular opinion here, but having 10k
>>>> linux emulation layer that _almost_ works in the tree will be an
>>>> unfortunate event and will do more damage to FreeBSD as a platform
>>>> than good in the long run. I would rather see this code never hit
>>>> main repository.
>>>
>>> I would argue that the layer works very well for infiniband. Much
>>> better than almost. It is only almost complete in that there is no
>>> need for me to implement features that we're not using.
>>>
>>> I am interested in hearing your other concerns however.
>>>
>>> Thanks,
>>> Jeff
>>>
>>
>
> Alexander, let me first start out by saying I have a great deal of
> respect for you and I hear your concerns. I see that this is a
> somewhat heated issue and I can really only address the technical
> points. The more existential questions about FreeBSD will have to
> be left to others.
>
>> The considerations are simple enough. First, we do not have many IB
>> users of FreeBSD in the wild and those that we have (Isilon) seem
>> to be perfectly capable of managing the IB stack out of the tree,
>> without dumping the thousands of lines of the code into the base.
>> If they had the stack before, but were not willing/capable to
>> provide adequate care for it in the past, there is no reason to
>> expect things to change with second stack, which now will rot in
>> our tree instead of theirs.
>
> They provided adequate care for it to keep their product running on
> old versions of FreeBSD. Unfortunately it is a large stack and
> there are a great number of people and organizations working on
> improving and advancing it on Linux via OFED and having a private
> stack does not give you the benefit of their work.
> The motivation for making the wrapper layer was entirely to keep
> pace with this development and make it less likely that what is in
> the tree will rot.
>
>>
>> Second, semi-complete Linux compat layer in kernel will have the
>> same effect as linuxulator in userland - we do have some vendors still
>> trying to bother with FreeBSD drivers for their hardware now and we
>> will have none after we provide the possibility to hack their Linux
>> code to run somewhat stably on top of Linux compat layer. Due to
>> intentional fluidity of Linux kAPI, our shims will never quite walk
>> and quack like their original implementation in Linux kernel and
>> combined result will always be less stable than native Linux
>> drivers in Linux kernel.
>
> I have heard this argument about the linuxulator and what we're
> really talking about is slipping FreeBSD marketshare. I don't share
> the view that the linuxulator furthered this slip but rather my view
> is that it allows us to stay relevant in areas where companies
> cannot justify an independent FreeBSD effort. Adobe is a good example
> of this.
>
> Let's talk nuts and bolts about what this thing does. In the vast
> majority of cases it simply shuffles arguments and function names
> around where there is a 1:1 correlation between linux api and
> FreeBSD API. Think about things like atomics, callouts, locks,
> jiffies vs ticks, etc. In these areas the systems are trivially
> different. In a very small number of areas where this wasn't the
> case I did a direct port and noted it with an #ifdef.
>
> This works specifically in the infiniband case because it is its own
> middle layer. You can't write a scsi driver for linux and use it on
> BSD with this. You can't write a network driver even. But if you
> do bring in code from linux you don't have to worry about changing
> every kmalloc to malloc and every printk to printf so diffs can be
> reduced in trivial cases.
> I thought given your work on XFS for
> FreeBSD that would make sense to you.
>
> Our options are, to leave FreeBSD users without infiniband, which I
> can tell you has cost us more market share as I know of specific
> cases we have lost due to it. To maintain our own stack
> independently, which no one has the budget for. Or to try to
> integrate with OFED. Do you see some other approach?

As you may know, Alexander and I both work for a company that produces
a "large" driver that runs on everything from Windows to FreeBSD and
everything in between (Linux, OS X, ESX, AIX, Solaris). We have a
porting layer (Alexander hates it, I know). But it's not a
FreeBSD-to-Linux layer; it's a FreeBSD (or whatever) to 'internal'
layer, though it started out as the former.

Looking at what we have and what you have, it seems to me that we
could take a subset of the basic CS101 methods that are there and make
a Linux driver porter's toolkit. Things like Linux-style linked lists
vs our queue macros can be addressed easily, but it's just a pain in
the neck to do so. Linux run queues and such are also different, but
making a small set of toolkits to cover them wouldn't be a bad idea.
Some of the more esoteric parts might stay with the IB code, however.

>
> Thanks,
> Jeff
>
>>
>>
>> --
>> Alexander Kabaev
>>
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
>
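
For readers who want the "shuffles arguments and function names" point in concrete form, the following is a minimal userland sketch of what such 1:1 shims might look like. It is not the code in sys/ofed/include/ and has not been checked against it. The fbsd_* functions, the FBSD_M_* flag values, and linux_style_probe() are hypothetical stand-ins chosen so that the example compiles outside a kernel; only the general shape (Linux names forwarding to a malloc(9)-like call that takes a size, a type tag, and flags) follows the description in the thread.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* ---- Stand-ins for the native side (hypothetical names and values) ---- */
#define FBSD_M_NOWAIT 0x0001
#define FBSD_M_WAITOK 0x0002
#define FBSD_M_ZERO   0x0100

static int fbsd_ticks = 0;      /* stand-in for the kernel tick counter */

/* Shaped like malloc(9): a size, a type tag, and behaviour flags. */
static void *fbsd_malloc(size_t size, const char *type, int flags)
{
    void *p = malloc(size);

    (void)type;                 /* the tag is only bookkeeping here */
    if (p != NULL && (flags & FBSD_M_ZERO))
        memset(p, 0, size);
    return p;
}

static void fbsd_free(void *p, const char *type)
{
    (void)type;
    free(p);
}

/* ---- Linux-named shims: rename and reshuffle arguments, nothing more ---- */
#define GFP_KERNEL  FBSD_M_WAITOK
#define GFP_ATOMIC  FBSD_M_NOWAIT
#define __GFP_ZERO  FBSD_M_ZERO

#define jiffies     fbsd_ticks
#define printk(...) printf(__VA_ARGS__)

static inline void *kmalloc(size_t size, int gfp)
{
    /* The shim supplies the type tag that the Linux call does not carry. */
    return fbsd_malloc(size, "kmalloc", gfp);
}

static inline void *kzalloc(size_t size, int gfp)
{
    return kmalloc(size, gfp | __GFP_ZERO);
}

static inline void kfree(const void *p)
{
    fbsd_free((void *)p, "kmalloc");
}

/* ---- "Unmodified" Linux-style driver code built against the shims ---- */
static int linux_style_probe(void)
{
    int *regs = kzalloc(4 * sizeof(*regs), GFP_KERNEL);
    int sum;

    if (regs == NULL)
        return -1;
    sum = regs[0] + regs[1] + regs[2] + regs[3];   /* zeroed by kzalloc */
    printk("probe: jiffies=%d sum=%d\n", jiffies, sum);
    kfree(regs);
    return sum;
}
```

Calling linux_style_probe() from a small main() exercises the whole path. The design point is that the driver body keeps its Linux spelling, so diffs against upstream stay small, while all of the translation lives in a handful of macros and inline functions.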