Date: Sat, 28 Mar 2009 22:41:38 +0100
From: Marius Strobl <marius@alchemy.franken.de>
To: Andreas Tobler <andreas.tobler@nexus-ag.com>
Cc: freebsd-sparc64@freebsd.org
Subject: Re: kernel panic with firewire PCI card
Message-ID: <20090328214138.GA93149@alchemy.franken.de>
In-Reply-To: <49CD39B7.3050500@nexus-ag.com>
References: <49CD39B7.3050500@nexus-ag.com>

On Fri, Mar 27, 2009 at 09:40:23PM +0100, Andreas Tobler wrote:
> Hello,
>
> I get the below panic while having plugged in a firewire PCI card. The
> card itself is not 'sun' compliant. Means, it is a custom fw card made
> for PC's I guess. But as far as I understand, fbsd can handle non sun
> PCI cards, can't it?

If the driver is well-written, i.e. correctly uses bus_dma(9) and
bus_space(9), is LP64- and endian-clean and deals with strict alignment
requirements, and the chip plays nice with Sun's interpretation of the
PCI specifications, they should work. Drivers that rely on firmware on
the card to do some of the initialization still won't work, though, if
it's "PC firmware". Typical examples of the latter are ATA and graphics
controllers. In theory there are some ways one can make them work anyway,
but that would be quite some work to do, if at all feasible.

>
> The os itself is current as of yesterday, the kernel rev you see below.
>
> Is there anything more I can provide to debug this issue?
>
> Btw, the machine is a ultra60.
>
<...>
> fwohci0: <Texas Instruments TSB12LV23> mem
> 0x4008000-0x40087ff,0x400c000-0x400ff
> ff at device 4.0 on pci0
> fwohci0: [ITHREAD]
> fwohci0: OHCI version 1.0 (ROM=1)
> fwohci0: No. of Isochronous channels is 4.
> fwohci0: EUI64 00:10:74:60:00:00:ee:a9
> panic: pcib: PCI bus B error AFAR 0x1ff840080ec AFSR 0x4000f00000000000
> cpuid = 0
> KDB: enter: panic
> [thread pid 0 tid 100000 ]
> Stopped at kdb_enter+0x80: ta %xcc, 1
> db> bt
> Tracing pid 0 tid 100000 td 0xc08ad870
> panic() at panic+0x20c
> psycho_pci_bus() at psycho_pci_bus+0x88
> intr_event_handle() at intr_event_handle+0x5c
> intr_execute_handlers() at intr_execute_handlers+0x14
> intr_fast() at intr_fast+0x68
> -- interrupt level=0xd pil=0 %o7=0xc0659be4 --
> -- data access error %o7=0x32a --
> fwphy_rddata() at fwphy_rddata+0xe8
> fwohci_reset() at fwohci_reset+0x298
> fwohci_init() at fwohci_init+0x9f8
> fwohci_pci_attach() at fwohci_pci_attach+0x278
> device_attach() at device_attach+0x4a4

PCI AFSR 0x4000000000000000 indicates that the primary error was a
target abort. Given that no DMA is involved at this stage, this means
it actually was the OHCI chip which complained about the PIO access.

To see what code is actually involved, check with "l *(0xc0659be4)",
"l *(fwphy_rddata+0xe8)" and "l *(fwohci_reset+0x298)" in gdb on the
corresponding kernel.debug. If this is the first access after the
reset, I'd suspect the problem to be a combination of a sloppy driver
with a chip that takes some more time than the other contenders to get
ready again after a reset, i.e. fwohci_reset() only tries 100 times,
waiting one millisecond between tries, for OHCI_HCC_RESET to clear
after the reset (the latter part is in line with the OHCI
specification). Increasing this to e.g. 1000 tries should solve the
panic then, if this is actually the cause.

Generally, fwohci(4) should be changed to fail if the chip doesn't
become ready again after a reset instead of just ignoring that problem,
though. At least fwohci_reset() (there are probably more such functions
in fwohci(4)) also seems to miss some bus space barriers, which could
also be the cause of this panic.

Marius
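
For illustration only, the bounded wait suggested above for
fwohci_reset() could look roughly like the userland sketch below. It is
not the fwohci(4) code: the chip is simulated, and every name in it
(sim_chip, sim_read_hcc, sim_delay_1ms, sketch_wait_reset,
SKETCH_HCC_RESET and its bit value) is made up for the example. A real
driver would read the register via bus_space(9), add the missing bus
space barriers around the accesses and wait with something like
DELAY(9). The point is only that the loop returns an error when the
reset bit never clears, instead of carrying on regardless.

```c
#include <errno.h>
#include <stdint.h>

/* Illustrative bit value only; the driver has its own OHCI_HCC_RESET. */
#define SKETCH_HCC_RESET (1u << 16)

/*
 * Simulated chip (hypothetical): it reports the reset bit as still set
 * for the first polls_needed reads, then as cleared.
 */
struct sim_chip {
	unsigned polls_needed;
	unsigned polls_done;
};

/* Stand-in for a register read done via bus_space(9) in a real driver. */
static uint32_t
sim_read_hcc(struct sim_chip *c)
{
	if (c->polls_done < c->polls_needed) {
		c->polls_done++;
		return (SKETCH_HCC_RESET);
	}
	return (0);
}

/* Stand-in for a one millisecond delay; a no-op in this simulation. */
static void
sim_delay_1ms(void)
{
}

/*
 * Poll up to max_tries times for the reset bit to clear.
 * Returns 0 once the bit is clear, ETIMEDOUT if it never clears, so
 * that the caller can fail the attach instead of touching a chip that
 * is not ready yet.
 */
static int
sketch_wait_reset(struct sim_chip *c, unsigned max_tries)
{
	unsigned i;

	for (i = 0; i < max_tries; i++) {
		if ((sim_read_hcc(c) & SKETCH_HCC_RESET) == 0)
			return (0);
		sim_delay_1ms();
	}
	return (ETIMEDOUT);
}
```

With a simulated chip that needs 500 polls, sketch_wait_reset(&chip, 100)
returns ETIMEDOUT, while a fresh chip polled with a limit of 1000
returns 0, which mirrors the "100 vs. 1000 tries" reasoning above.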