From owner-freebsd-sparc64@FreeBSD.ORG Sun Mar 29 18:15:39 2009
Return-Path:
Delivered-To: freebsd-sparc64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 1750D106568F
	for ; Sun, 29 Mar 2009 18:15:39 +0000 (UTC)
	(envelope-from andreast-list@fgznet.ch)
Received: from smtp.fgznet.ch (mail.fgznet.ch [81.92.96.47])
	by mx1.freebsd.org (Postfix) with ESMTP id 9FBBA8FC15
	for ; Sun, 29 Mar 2009 18:15:38 +0000 (UTC)
	(envelope-from andreast-list@fgznet.ch)
Received: from wolfram.andreas.nets ([91.190.8.131])
	by smtp.fgznet.ch (8.13.8/8.13.8/Submit_SMTPAUTH) with ESMTP id n2TIFZ2w015421;
	Sun, 29 Mar 2009 20:15:35 +0200 (CEST)
	(envelope-from andreast-list@fgznet.ch)
Message-ID: <49CFBAC6.5030809@fgznet.ch>
Date: Sun, 29 Mar 2009 20:15:34 +0200
From: Andreas Tobler
User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302)
MIME-Version: 1.0
To: Marius Strobl
References: <49CD39B7.3050500@nexus-ag.com> <20090328214138.GA93149@alchemy.franken.de>
In-Reply-To: <20090328214138.GA93149@alchemy.franken.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.64 on 81.92.96.47
Cc: freebsd-sparc64@freebsd.org
Subject: Re: kernel panic with firewire PCI card
X-BeenThere: freebsd-sparc64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Porting FreeBSD to the Sparc
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 29 Mar 2009 18:15:39 -0000

Marius Strobl wrote:
> On Fri, Mar 27, 2009 at 09:40:23PM +0100, Andreas Tobler wrote:
>> Hello,
>>
>> I get the below panic while having plugged in a firewire PCI card. The
>> card itself is not 'sun' compliant. Means, it is a custom fw card made
>> for PC's I guess. But as far as I understand, fbsd can handle non sun
>> PCI cards, can't it?
>
> If the driver is well-written, i.e.
> correctly uses bus_dma(9)
> and bus_space(9), is LP64- and endian-clean and deals with
> strict alignment requirements, and the chip plays nice
> with Sun's interpretation of the PCI specifications they
> should work. Drivers that rely on firmware on cards to do
> some of the initialization still won't work though if it's
> "PC firmware". Typical examples of the latter are ATA and
> graphics controllers. In theory there are some ways one
> can make them work anyway but that would be quite some
> work to do if at all feasible.

Aha. Good to know.

>> fwohci0: mem 0x4008000-0x40087ff,0x400c000-0x400ffff at device 4.0 on pci0
>> fwohci0: [ITHREAD]
>> fwohci0: OHCI version 1.0 (ROM=1)
>> fwohci0: No. of Isochronous channels is 4.
>> fwohci0: EUI64 00:10:74:60:00:00:ee:a9
>> panic: pcib: PCI bus B error AFAR 0x1ff840080ec AFSR 0x4000f00000000000
>> cpuid = 0
>> KDB: enter: panic
>> [thread pid 0 tid 100000 ]
>> Stopped at kdb_enter+0x80: ta %xcc, 1
>> db> bt
>> Tracing pid 0 tid 100000 td 0xc08ad870
>> panic() at panic+0x20c
>> psycho_pci_bus() at psycho_pci_bus+0x88
>> intr_event_handle() at intr_event_handle+0x5c
>> intr_execute_handlers() at intr_execute_handlers+0x14
>> intr_fast() at intr_fast+0x68
>> -- interrupt level=0xd pil=0 %o7=0xc0659be4 --
>> -- data access error %o7=0x32a --
>> fwphy_rddata() at fwphy_rddata+0xe8
>> fwohci_reset() at fwohci_reset+0x298
>> fwohci_init() at fwohci_init+0x9f8
>> fwohci_pci_attach() at fwohci_pci_attach+0x278
>> device_attach() at device_attach+0x4a4
>
> PCI AFSR 0x4000000000000000 indicates that the primary error
> was a target abort. Given that no DMA is involved at this stage
> this means it actually was the OHCI chip which complained
> about the PIO access.
> If this is the first access after the
> reset (check with "l *(0xc0659be4)", "l *(fwphy_rddata+0xe8)"
> and "l *(fwohci_reset+0x298)" in gdb on the corresponding
> kernel.debug what code is actually involved) I'd suspect
> the problem to be a combination of a sloppy driver with a
> chip that takes some more time than the other contenders
> to get ready again after a reset, i.e. fwohci_reset() only
> tries 100 times with waiting one millisecond between tries
> for OHCI_HCC_RESET to clear after the reset (the latter part
> is in line with the OHCI specification). Increasing to f.e.
> 1000 tries should solve the panic then, if this is actually
> the cause. Generally fwohci(4) should be changed to fail if
> the chip doesn't become ready again after a reset instead
> of just ignoring that problem though. At least fwohci_reset()
> (there are probably more such functions in fwohci(4)) also
> seems to miss some bus space barriers, which also could be
> the cause of this panic.

I increased the loop counter limit to 1000 in fwohci_reset():

 	while(OREAD(sc, OHCI_HCCCTL) & OHCI_HCC_RESET) {
-		if (i++ > 100) break;
+		if (i++ > 1000) break;
 		DELAY(1000);
 	}

This did not help so far.

I also tried to check the addresses you mentioned with l *(0xXXXX),
but here I am missing something. I guess I need to invoke gdb somehow?

Thanks so far, I will continue playing :)

Andreas

With the firewire debug output switched on (hardcoded) I get the output
below. You can see that the reset loop terminates:
fwohci0: resetting OHCI...done (loop=0)

u60# kldload firewire
fwohci0: mem 0x4008000-0x40087ff,0x400c000-0x400ffff at device 4.0 on pci0
fwohci0: latency timer 24 -> 32.
fwohci0: cache size 16 -> 16.
fwohci0: [ITHREAD]
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:10:74:60:00:00:ee:a9
fwohci0: resetting OHCI...done (loop=0)
panic: pcib: PCI bus B error AFAR 0x1ff840080ec AFSR 0x4000f00000000000
cpuid = 0
KDB: stack backtrace:
panic() at 0xc03361a8 = panic+0x1c8
psycho_pci_bus() at 0xc0628848 = psycho_pci_bus+0x88
intr_event_handle() at 0xc030be7c = intr_event_handle+0x5c
intr_execute_handlers() at 0xc0638fb4 = intr_execute_handlers+0x14
intr_fast() at 0xc0081328 = intr_fast+0x68
-- interrupt level=0xd pil=0 %o7=0xc0372920 --
-- data access error %o7=0xc0895000 --
fwphy_rddata() at 0xc0ef8b60 = fwphy_rddata+0x120
fwohci_reset() at 0xc0efb9d8 = fwohci_reset+0x1d8
fwohci_init() at 0xc0efcb54 = fwohci_init+0x9d4
fwohci_pci_attach() at 0xc0eff0f8 = fwohci_pci_attach+0x278
device_attach() at 0xc0366544 = device_attach+0x4a4
device_probe_and_attach() at 0xc03674a4 = device_probe_and_attach+0x64
pci_driver_added() at 0xc021c054 = pci_driver_added+0x154
devclass_driver_added() at 0xc0363bb4 = devclass_driver_added+0x74
devclass_add_driver() at 0xc03648a8 = devclass_add_driver+0xa8
driver_module_handler() at 0xc0365fd8 = driver_module_handler+0x58
module_register_init() at 0xc032223c = module_register_init+0xdc
linker_load_module() at 0xc0317994 = linker_load_module+0xbd4
kern_kldload() at 0xc0317f58 = kern_kldload+0xb8
kldload() at 0xc03180e0 = kldload+0x60
syscall() at 0xc064a2f0 = syscall+0x2f0
-- syscall (304, FreeBSD ELF64, kldload) %o7=0x1008e0 --
userland() at 0x4045b708
user trace: trap %o7=0x1008e0
pc 0x4045b708, sp 0x7fdffffe1e1
pc 0x1006f0, sp 0x7fdffffe2a1
pc 0x40206934, sp 0x7fdffffe361
done
KDB: enter: panic
[thread pid 1305 tid 100066 ]
Stopped at 0xc036dd20 = kdb_enter+0x80: ta %xcc, 1
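
For reference, here is a rough userland sketch of the "fail if the chip
doesn't become ready again after a reset" behaviour suggested for
fwohci_reset(). It is only a model, not the driver code: fake_ohci,
fake_read_hccctl(), FAKE_HCC_RESET and wait_for_reset() are made-up names,
the bit value is arbitrary, and the real code would read the register with
OREAD() and wait with DELAY(1000) between polls. Since the debug output
above shows "done (loop=0)", the timeout is probably not what triggers this
particular panic; the sketch just shows how the loop could report an error
instead of silently carrying on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit value, used only inside this model. */
#define FAKE_HCC_RESET 0x00010000u

/*
 * Hypothetical stand-in for the chip: the reset bit reads back as set
 * for the next polls_until_ready reads, then reads back as clear.
 */
struct fake_ohci {
	int polls_until_ready;
};

/* Models a read of the control register (OREAD() in the driver). */
static uint32_t
fake_read_hccctl(struct fake_ohci *chip)
{
	if (chip->polls_until_ready > 0) {
		chip->polls_until_ready--;
		return (FAKE_HCC_RESET);
	}
	return (0);
}

/*
 * Poll until the reset bit clears. Returns 0 on success and -1 if the
 * bit is still set after max_tries reads, so that the caller can abort
 * the attach instead of touching a chip that is not ready.
 */
static int
wait_for_reset(struct fake_ohci *chip, int max_tries)
{
	int i;

	assert(max_tries > 0);
	for (i = 0; i < max_tries; i++) {
		if ((fake_read_hccctl(chip) & FAKE_HCC_RESET) == 0)
			return (0);
		/* The driver would wait here, e.g. DELAY(1000). */
	}
	return (-1);
}
```

A caller would check the return value and bail out of the attach path on
-1 rather than going on to access the PHY registers.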