Message-ID: <3C6E2D61.66B3C362@mindspring.com>
Date: Sat, 16 Feb 2002 01:58:57 -0800
From: Terry Lambert
To: Andy Sporner
Cc: freebsd-hackers@freebsd.org
Subject: Re: Porting a device driver from NetBSD to FreeBSD

Andy Sporner wrote:
[...]

Here is the sum total of my clue-fu on this problem; it is mostly
supposition, because of incomplete information.  Bill Paul's kung-fu
in ethernet drivers is much greater than anyone else's... it beats
the heck out of my cowering piglet style.  ;-).  The best advice
*anyone* could give you is to "ask Bill Paul".

> Can you give me any hints on
> the device driver question I posted a few days ago.  There was a
> response, however I don't see how it applies for these reasons.
>
> 1. When the hardware (board) is inserted, but no kernel driver
>    there are no failures.

In other words, your driver problem is in your driver.  8-).

> 2. When the hardware is installed with the minimal kernel driver
>    the system locks.  The minimal kernel driver only attaches some
>    resources.

This appears to be a network driver.  There are several possible
complications.
The first is the lack of an interrupt handler that just discards the
events; this might be because you showed us an incomplete driver.
The second is that NetBSD could be doing something in its bus
management code that FreeBSD isn't, and without that, there's a
problem ("switch_intr" doesn't look like the right thing to call, to
me).

What happens if you just probe the thing, and don't try to attach or
detach at all (always fail the probe, but printf when it would have
been successful)?  You implied that you did this, but it wasn't
entirely clear that you had not attached it.  Really, merely
attaching a device should do nothing.

If it's a PCCARD device, and it just looks like a PCI device because
you haven't included all the information, it could easily be the
pccard code.  You also didn't say on which version of FreeBSD you are
doing this (posting to -hackers really doesn't identify the version
very well 8-)).

> 3. When doing the full initialization of the device (which works
>    in NetBSD) there are also the SAME failures as doing no
>    initialization at all of the hardware (as seen in the samples posted).

I recommend looking at the if_tx.c driver, and using that as a guide,
since it does some of the strange stuff you seem to need to do, and
it (apparently) works.  It might just be that you are setting only
PCI_COMMAND_MASTER_ENABLE, rather than setting:

	command |= PCI_COMMAND_IO_ENABLE | PCI_COMMAND_MEM_ENABLE |
	    PCI_COMMAND_MASTER_ENABLE;

;-).

Be aware, though, that the if_tx.c driver is pretty dumb about this,
with the "EPIC_USEIOSPACE" manifest constant: if it would simply pick
the preferred one first, and only fatally fail in the case that both
failed, then it would be much more robust, and able to work in many
more circumstances, without needing a kernel recompilation to try the
other approach.

You say that you have a version that doesn't call "init_nitro"?  If
not, then it could easily be crapping on memory or I/O space that it
has not allocated for its use.

> 4. The device driver does not use MBUFS at all.

Not relevant, then... though if it's a network driver, it's obligated
to use mbufs at some point.

As a final "look for zebras", I'll note that perhaps the problem is
not in your driver at all, and that the problem is actually in
another driver.

The way this could work is that if the device shared PCI interrupts
with a rather bogus driver, then you could be locking in the bogus
driver as a result of having given it an interrupt from the galnet at
a time that it was not able to properly field the interrupt without
failure (e.g. perhaps the interrupt notification is non-atomic).

To avoid this during driver development, you should always make sure
that your dmesg shows your experimental driver on its own interrupt.
If it doesn't, you should probably juggle your cards until the
interrupt is not shared, or even consider simply disabling the driver
that is sharing the interrupt, if it is not an important device to
the development process, and you don't want to go card-juggling.

Actually, I don't know how PCI interrupt sharing is handled in
NetBSD; I would be surprised if they had not done the extra work to
make PCI interrupts non-shared, so long as there was a free one
available for the task; FreeBSD does not really do this
(reprogramming INT A/B/C/D usage based on free interrupts, or based
on preferences for Bridges being on their own INT pin vs. local
devices sharing, etc.).

In fact... your original post said:

] I have been trying to port a driver I had written on NetBSD to FreeBSD.
] On NetBSD the driver functions without incident, On FreeBSD, after a time
] the whole system locks up.
]
] I can create this by doing an FTP over the network interface (or sometimes
] heavy disk activity).
]
] I hope somebody can give me a hint of where I should look.  It seems that
] the PCI performance on FreeBSD is much faster in talking to this particular
] devic.
This implies to me that:

o	It's a network driver
o	It works for a while and breaks
o	The FreeBSD operation is unexpectedly fast

Together, this indicates that if you have the driver running and it
locks up the system later, you might have another device sharing an
interrupt with your card, and that it might be your own interrupt
routine which is treating someone else's interrupt on a shared
interrupt as if it's your own, and breaking on that count.

The "much faster" sort of implies that the other interrupt is causing
your driver to poll your device, so the "FreeBSD is faster" effect
you are seeing is just an illusion caused by the bad code.

Alternately, you could ask Bill Paul, since he's a better choice than
me on this sort of thing, anyway.  8-) 8-).

Hope this is useful, even if it doesn't come right out and say
"here's a patch".

-- 
Terry
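
For what it's worth, the shared-interrupt failure mode described
above can be sketched in a few lines of C.  This is a user-space
illustration only, not the driver under discussion: the demo_softc
structure, the DEMO_INTR_PENDING bit, and demo_intr() are all made-up
names, and a real handler would read its status from the device's
registers rather than from a struct field.  The point is just that a
handler on a shared line has to check whether its own device raised
the interrupt, and do nothing at all if it did not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bit; not taken from any real device. */
#define DEMO_INTR_PENDING 0x01u

/* Stand-in for a driver's per-device state. */
struct demo_softc {
	uint32_t intr_status;	/* simulated interrupt status register */
	unsigned handled;	/* interrupts that were ours and serviced */
	unsigned ignored;	/* interrupts raised by some other device */
};

/*
 * Handler for a (possibly shared) interrupt line.
 * Returns 1 if our device had an interrupt pending and it was
 * serviced, 0 if the interrupt belonged to someone else.
 */
static int
demo_intr(struct demo_softc *sc)
{
	uint32_t status;

	assert(sc != NULL);
	status = sc->intr_status;

	/* Not ours: leave the device alone and return immediately. */
	if ((status & DEMO_INTR_PENDING) == 0) {
		sc->ignored++;
		return (0);
	}

	/* Ours: acknowledge by clearing the pending bit, then service. */
	sc->intr_status = status & ~DEMO_INTR_PENDING;
	sc->handled++;
	return (1);
}
```

Calling demo_intr() on a zeroed demo_softc returns 0 and only bumps
the "ignored" counter; setting DEMO_INTR_PENDING in intr_status first
makes it return 1 and clear the bit.  A handler that skips the first
check ends up polling its device on every interrupt from its
neighbour, which is the "faster" illusion suspected above.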