From owner-freebsd-arch@FreeBSD.ORG Tue Jan 4 22:44:10 2011
Message-ID: <4D23A2CF.1010904@freebsd.org>
Date: Tue, 04 Jan 2011 14:44:31 -0800
From: Julian Elischer
To: Jeff Roberson
Cc: arch@freebsd.org
References: <20110103220153.69cf59e0@kan.dnsalias.net> <20110104082252.45bb5e7f@kan.dnsalias.net>
Subject: Re: Linux kernel compatability
List-Id: Discussion related to FreeBSD architecture

On 1/4/11 12:53 PM, Jeff Roberson wrote:
> On Tue, 4 Jan 2011, Alexander Kabaev wrote:
>
>> On Mon, 3 Jan 2011 19:03:01 -1000 (HST)
>> Jeff Roberson wrote:
>>
>>> On Mon, 3 Jan 2011, Alexander Kabaev wrote:
>>>
>>>> On
Mon, 3 Jan 2011 10:31:24 -1000 (HST)
>>>> Jeff Roberson wrote:
>>>>
>>>>> Hello Folks,
>>>>>
>>>>> Some of you may have seen my infiniband work proceed in svn. It is
>>>>> coming to a close soon and I will be integrating it into current.
>>>>> I have a few patches to the kernel to send for review but I wanted
>>>>> to bring up the KPI wrapper itself for discussion.
>>>>>
>>>>> The infiniband port has been done by creating a 10,000 line KPI
>>>>> compatibility layer. With this layer the vast majority of the
>>>>> driver code runs unmodified. The exceptions are anything that
>>>>> interfaces with skbs and most of the code that deals with network
>>>>> interfaces.
>>>>>
>>>>> Some examples of things supported by the wrapper:
>>>>>
>>>>> atomics, types, bitops, byte order conversion, character devices,
>>>>> pci devices, dma, non-device files, idr tables, interrupts,
>>>>> ioremap, hashes, kobjects, radix trees, lists, modules, notifier
>>>>> blocks, rbtrees, rwlock, rwsem, semaphore, schedule, spinlocks,
>>>>> kalloc, wait queues, workqueues, timers, etc.
>>>>>
>>>>> Obviously a complete wrapper is impossible and I only implemented
>>>>> the features that I needed. The build is accomplished by pointing
>>>>> the linux compatible code at sys/ofed/include/ which has a
>>>>> simulated linux kernel include tree. There are some config(8)
>>>>> changes to help this along as well.
>>>>>
>>>>> I have seen that some attempt at similar wrappers has been made
>>>>> elsewhere. I wonder if instead of making each one tailored to
>>>>> individual components, which mostly seem to be filesystems so far,
>>>>> should we put this in a central place under compat somewhere? Is
>>>>> this project doomed to be tied to a single consumer by the specific
>>>>> nature of it?
>>>>>
>>>>> Other comments or concerns?
>>>>>
>>>>> Thanks,
>>>>> Jeff
>>>>
>>>>
>>>> This probably will go against popular opinion here, but having a
>>>> 10k-line Linux emulation layer that _almost_ works in the tree will
>>>> be an unfortunate event and will do more damage to FreeBSD as a
>>>> platform than good in the long run. I would rather see this code
>>>> never hit the main repository.
>>>
>>> I would argue that the layer works very well for infiniband. Much
>>> better than almost. It is only almost complete in that there is no
>>> need for me to implement features that we're not using.
>>>
>>> I am interested in hearing your other concerns however.
>>>
>>> Thanks,
>>> Jeff
>>>
>>
>
> Alexander, let me first start out by saying I have a great deal of
> respect for you and I hear your concerns. I see that this is a
> somewhat heated issue and I can really only address the technical
> points. The more existential questions about FreeBSD will have to
> be left to others.
>
>> The considerations are simple enough. First, we do not have many IB
>> users of FreeBSD in the wild and those that we have (Isilon) seem
>> to be perfectly capable of managing the IB stack out of the tree,
>> without dumping the thousands of lines of the code into the base.
>> If they had the stack before, but were not willing/capable to
>> provide adequate care for it in the past, there is no reason to
>> expect things to change with a second stack, which now will rot in
>> our tree instead of theirs.
>
> They provided adequate care for it to keep their product running on
> old versions of FreeBSD. Unfortunately it is a large stack and
> there are a great number of people and organizations working on
> improving and advancing it on Linux via OFED and having a private
> stack does not give you the benefit of their work. The motivation
> for making the wrapper layer was entirely to keep pace with this
> development and make it less likely that what is in the tree will rot.
>
>>
>> Second, a semi-complete Linux compat layer in the kernel will have
>> the same effect as the linuxulator in userland - we do have some
>> vendors still trying to bother with FreeBSD drivers for their
>> hardware now and we will have none after we provide the possibility
>> to hack their Linux code to run somewhat stably on top of a Linux
>> compat layer. Due to the intentional fluidity of the Linux kAPI, our
>> shims will never quite walk and quack like their original
>> implementation in the Linux kernel and the combined result will
>> always be less stable than native Linux drivers in the Linux kernel.
>
> I have heard this argument about the linuxulator and what we're
> really talking about is slipping FreeBSD marketshare. I don't share
> the view that the linuxulator furthered this slip but rather my view
> is that it allows us to stay relevant in areas where companies
> cannot justify an independent FreeBSD effort. Adobe is a good example
> of this.
>
> Let's talk nuts and bolts about what this thing does. In the vast
> majority of cases it simply shuffles arguments and function names
> around where there is a 1:1 correlation between linux api and
> FreeBSD API. Think about things like atomics, callouts, locks,
> jiffies vs ticks, etc. In these areas the systems are trivially
> different. In a very small number of areas where this wasn't the
> case I did a direct port and noted it with an #ifdef.
>
> This works specifically in the infiniband case because it is its own
> middle layer. You can't write a scsi driver for linux and use it on
> BSD with this. You can't write a network driver even. But if you
> do bring in code from linux you don't have to worry about changing
> every kmalloc to malloc and every printk to printf so diffs can be
> reduced in trivial cases. I thought given your work on XFS for
> FreeBSD that would make sense to you.
>
> Our options are: to leave FreeBSD users without infiniband, which I
> can tell you has cost us more market share as I know of specific
> cases we have lost due to it; to maintain our own stack
> independently, which no one has the budget for; or to try to
> integrate with OFED. Do you see some other approach?

As you may know, Alexander and I both work for a company that produces a
"large" driver that runs on everything from Windows to FreeBSD and
everything in between (Linux, OS X, ESX, AIX, Solaris). We have a porting
layer (Alexander hates it, I know). But it's not a FreeBSD-to-Linux layer,
it's a FreeBSD (or whatever) to 'internal' layer (though it started out
being the former).

Looking at what we have and what you have, it seems to me that we could
take a subset of the basic CS101 methods that are there and make a Linux
driver porter's toolkit. Things like Linux-style linked lists vs our queue
macros can be addressed easily, but it's just a pain in the neck to do so.
Linux run queues and such are also different, but making a small set of
toolkits to cover them wouldn't be a bad idea. Some of the more esoteric
parts might stay with the IB code, however.

>
> Thanks,
> Jeff
>
>>
>>
>> --
>> Alexander Kabaev
>>
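
To make the "porter's toolkit" idea concrete, here is a minimal userland sketch of what such a header might look like. It is an illustration under stated assumptions, not code from the OFED port: the shim bodies forward to libc where a kernel version would presumably forward to malloc(9), free(9) and printf(9), the list is a standalone reimplementation of the Linux idiom rather than a mapping onto queue(3), and `struct item` and `sum_items` are hypothetical names made up for the demo.

```c
/* Userland sketch of a tiny "porter's toolkit" header (illustrative only). */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Linux allocation flags are accepted and ignored in this sketch. */
#define GFP_KERNEL 0
/* Linux log-level prefixes are string literals; keep them empty here. */
#define KERN_INFO ""

/* 1:1 renames: only the spelling differs between the two systems. */
static inline void *kmalloc(size_t size, int flags)
{
	(void)flags;
	return malloc(size);
}

static inline void kfree(void *p)
{
	free(p);
}

#define printk(...) printf(__VA_ARGS__)

/* Minimal Linux-style circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static inline void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static inline void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Recover the enclosing structure from a pointer to its list member. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define list_for_each(pos, head) \
	for ((pos) = (head)->next; (pos) != (head); (pos) = (pos)->next)

/* Hypothetical "driver" code written against the Linux spellings only. */
struct item {
	int value;
	struct list_head link;
};

int sum_items(const int *values, size_t n)
{
	struct list_head head, *pos, *tmp;
	int total = 0;

	assert(values != NULL || n == 0);
	INIT_LIST_HEAD(&head);
	for (size_t i = 0; i < n; i++) {
		struct item *it = kmalloc(sizeof(*it), GFP_KERNEL);
		if (it == NULL)
			break;
		it->value = values[i];
		list_add_tail(&it->link, &head);
	}
	list_for_each(pos, &head)
		total += list_entry(pos, struct item, link)->value;
	printk(KERN_INFO "sum_items: total=%d\n", total);

	/* Free every node; fetch the next pointer before releasing a node. */
	for (pos = head.next; pos != &head; pos = tmp) {
		tmp = pos->next;
		kfree(list_entry(pos, struct item, link));
	}
	return total;
}
```

The body of `sum_items` never mentions malloc, free or printf, so the same source could in principle be built against either system's headers. That is the diff-reduction argument from the quoted text in miniature. Anything without a 1:1 counterpart would still need a real port.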