Date: Tue, 4 Jan 2011 13:13:00 -1000 (HST)
From: Jeff Roberson
To: Julian Elischer
Cc: arch@freebsd.org
Subject: Re: Linux kernel compatability
In-Reply-To: <4D23A2CF.1010904@freebsd.org>
References: <20110103220153.69cf59e0@kan.dnsalias.net> <20110104082252.45bb5e7f@kan.dnsalias.net> <4D23A2CF.1010904@freebsd.org>
List-Id: Discussion related to FreeBSD architecture

On Tue, 4 Jan 2011, Julian Elischer wrote:

> On 1/4/11 12:53 PM, Jeff Roberson wrote:
>> On Tue, 4 Jan 2011, Alexander Kabaev wrote:
>>
>>> On Mon, 3 Jan 2011 19:03:01 -1000 (HST)
>>> Jeff Roberson wrote:
>>>
>>>> On Mon, 3 Jan 2011, Alexander Kabaev wrote:
>>>>
>>>>> On Mon, 3 Jan 2011 10:31:24 -1000 (HST)
>>>>> Jeff Roberson wrote:
>>>>>
>>>>>> Hello Folks,
>>>>>>
>>>>>> Some of you may have seen my infiniband work proceed in svn. It is
>>>>>> coming to a close soon and I will be integrating it into current.
>>>>>> I have a few patches to the kernel to send for review but I wanted
>>>>>> to bring up the KPI wrapper itself for discussion.
>>>>>>
>>>>>> The infiniband port has been done by creating a 10,000 line KPI
>>>>>> compatibility layer. With this layer the vast majority of the
>>>>>> driver code runs unmodified. The exceptions are anything that
>>>>>> interfaces with skbs and most of the code that deals with network
>>>>>> interfaces.
>>>>>>
>>>>>> Some examples of things supported by the wrapper:
>>>>>>
>>>>>> atomics, types, bitops, byte order conversion, character devices,
>>>>>> pci devices, dma, non-device files, idr tables, interrupts,
>>>>>> ioremap, hashes, kobjects, radix trees, lists, modules, notifier
>>>>>> blocks, rbtrees, rwlock, rwsem, semaphore, schedule, spinlocks,
>>>>>> kalloc, wait queues, workqueues, timers, etc.
>>>>>>
>>>>>> Obviously a complete wrapper is impossible and I only implemented
>>>>>> the features that I needed. The build is accomplished by pointing
>>>>>> the linux compatible code at sys/ofed/include/ which has a
>>>>>> simulated linux kernel include tree. There are some config(8)
>>>>>> changes to help this along as well.
>>>>>>
>>>>>> I have seen that some attempts at similar wrappers have been made
>>>>>> elsewhere. I wonder if, instead of making each one tailored to
>>>>>> individual components, which mostly seem to be filesystems so far,
>>>>>> we should put this in a central place under compat somewhere? Is
>>>>>> this project doomed to be tied to a single consumer by the specific
>>>>>> nature of it?
>>>>>>
>>>>>> Other comments or concerns?
>>>>>>
>>>>>> Thanks,
>>>>>> Jeff
>>>>>
>>>>>
>>>>> This probably will go against popular opinion here, but having a
>>>>> 10k-line linux emulation layer that _almost_ works in the tree will
>>>>> be an unfortunate event and will do more damage to FreeBSD as a
>>>>> platform than good in the long run. I would rather see this code
>>>>> never hit the main repository.
>>>>
>>>> I would argue that the layer works very well for infiniband. Much
>>>> better than almost. It is only almost complete in that there is no
>>>> need for me to implement features that we're not using.
>>>>
>>>> I am interested in hearing your other concerns however.
>>>>
>>>> Thanks,
>>>> Jeff
>>>>
>>>
>>
>> Alexander, let me first start out by saying I have a great deal of respect
>> for you and I hear your concerns. I see that this is a somewhat heated
>> issue and I can really only address the technical points. The more
>> existential questions about FreeBSD will have to be left to others.
>>
>>> The considerations are simple enough. First, we do not have many IB
>>> users of FreeBSD in the wild and those that we have (Isilon) seem to be
>>> perfectly capable of managing the IB stack out of the tree, without
>>> dumping thousands of lines of code into the base. If they had
>>> the stack before, but were not willing/capable to provide adequate care
>>> for it in the past, there is no reason to expect things to change with
>>> a second stack, which now will rot in our tree instead of theirs.
>>
>> They provided adequate care for it to keep their product running on old
>> versions of FreeBSD. Unfortunately it is a large stack and there are a
>> great number of people and organizations working on improving and advancing
>> it on Linux via OFED, and having a private stack does not give you the
>> benefit of their work. The motivation for making the wrapper layer was
>> entirely to keep pace with this development and make it less likely that
>> what is in the tree will rot.
>>
>>>
>>> Second, a semi-complete Linux compat layer in the kernel will have the
>>> same effect as the linuxulator in userland - we do have some vendors still
>>> trying to bother with FreeBSD drivers for their hardware now and we
>>> will have none after we provide the possibility to hack their Linux
>>> code to run somewhat stably on top of a Linux compat layer. Due to the
>>> intentional fluidity of the Linux kAPI, our shims will never quite walk and
>>> quack like their original implementation in the Linux kernel and the combined
>>> result will always be less stable than native Linux drivers in the
>>> Linux kernel.
>>
>> I have heard this argument about the linuxulator and what we're really
>> talking about is slipping FreeBSD marketshare. I don't share the view that
>> the linuxulator furthered this slip but rather my view is that it allows us
>> to stay relevant in areas where companies cannot justify an independent
>> FreeBSD effort. Adobe is a good example of this.
>>
>> Let's talk nuts and bolts about what this thing does. In the vast majority
>> of cases it simply shuffles arguments and function names around where there
>> is a 1:1 correlation between the linux API and the FreeBSD API. Think about
>> things like atomics, callouts, locks, jiffies vs ticks, etc. In these areas
>> the systems are trivially different. In a very small number of areas where
>> this wasn't the case I did a direct port and noted it with an #ifdef.
>>
>> This works specifically in the infiniband case because it is its own middle
>> layer. You can't write a scsi driver for linux and use it on BSD with
>> this. You can't write a network driver even. But if you do bring in code
>> from linux you don't have to worry about changing every kmalloc to malloc
>> and every printk to printf so diffs can be reduced in trivial cases. I
>> thought given your work on XFS for FreeBSD that would make sense to you.
>>
>> Our options are: to leave FreeBSD users without infiniband, which I can
>> tell you has cost us more market share, as I know of specific cases we have
>> lost due to it; to maintain our own stack independently, which no one has
>> the budget for; or to try to integrate with OFED. Do you see some other
>> approach?
> As you may know Alexander and I both work for a company that produces a
> "large" driver that runs on everything from windows to FreeBSD and everything
> in between (linux, osx, esx, aix, solaris).. we have a porting layer
> (Alexander hates it I know).
> But it's not a freebsd to linux layer, it's a freebsd (or whatever) to
> 'internal' layer,
> (though it started out being the first).
> Looking at what we have and what you have it seems to me that we could take a
> subset of the basic CS101
> methods that are there and make a linux driver porter's toolkit.

Yes, the problem in this case is that OFED controls the code and they
specifically removed a portability layer. So the portability layer is the
Linux APIs they are currently using. Really in some sense it ends up
being the same thing.

>
> things like linux style linked lists vs our queue macros can be
> addressed easily, but it's just a pain in the neck to do so..
> also linux run queues and such are different but making a small
> set of toolkits to do so wouldn't be a bad idea.
> some of the more esoteric parts might stay with the ib code however.

After this discussion I'm leaning towards leaving the layer I have in the
ofed/ directory and leaving it tied to the version of ofed we currently
have imported. They actually have a set of scripts to 'backport' their
stack to different linux versions if we were to standardize on some older
linux kernel release, but I don't think it's worth the effort.

I understand wanting to limit the spread of hybridized linux kernel code.
It is not my first choice, but comparing it with the alternative of not
having some desired feature, I will choose the feature.
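
For illustration only, the "trivially different" 1:1 cases mentioned
earlier in the thread (kmalloc vs malloc, printk vs printf, jiffies vs
ticks) could be sketched roughly as below. This is a userland
approximation, not the wrapper in sys/ofed/include/: libc malloc/printf
stand in for the kernel's malloc(9)/printf(9), the flag values and the
ticks/hz variables are placeholder assumptions, and struct demo_dev and
demo_dev_create() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Placeholder flag values; the real Linux GFP flags differ. */
typedef unsigned int gfp_t;
#define GFP_KERNEL	0x1u
#define __GFP_ZERO	0x2u

/* kmalloc(): same job as malloc, honouring the zeroing flag. */
static inline void *
kmalloc(size_t size, gfp_t flags)
{
	return ((flags & __GFP_ZERO) ? calloc(1, size) : malloc(size));
}

static inline void *
kzalloc(size_t size, gfp_t flags)
{
	return (kmalloc(size, flags | __GFP_ZERO));
}

static inline void
kfree(const void *ptr)
{
	free((void *)ptr);
}

/* printk(): a pure rename in this sketch (log levels are ignored). */
#define printk(...)	printf(__VA_ARGS__)

/* Userland stand-ins for the kernel tick counter and tick rate. */
static volatile int ticks;
static const int hz = 1000;
#define jiffies	ticks
#define HZ	hz

/* Convert milliseconds to ticks, rounding up. */
static inline int
msecs_to_jiffies(int msec)
{
	assert(msec >= 0);
	return ((msec * HZ + 999) / 1000);
}

/* Hypothetical "Linux-style" consumer that compiles against the shim. */
struct demo_dev {
	int	id;
	char	name[16];
};

static struct demo_dev *
demo_dev_create(int id)
{
	struct demo_dev *dev;

	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
	if (dev == NULL)
		return (NULL);
	dev->id = id;
	printk("demo_dev %d created at jiffies %d\n", dev->id, jiffies);
	return (dev);
}
```

The point of the sketch is only that demo_dev_create() needs no edits
to build: every Linux name it uses resolves to a thin inline function
or macro. Anything without such a direct counterpart (skbs, network
interfaces) would not fit this pattern.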

Thanks,
Jeff

>
>
>
>>
>> Thanks,
>> Jeff
>>
>>>
>>>
>>> --
>>> Alexander Kabaev
>>>
>> _______________________________________________
>> freebsd-arch@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
>> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
>>
>