From owner-freebsd-sparc64@FreeBSD.ORG Wed Apr 11 11:00:24 2012
Date: Wed, 11 Apr 2012 11:00:24 GMT
Message-Id: <201204111100.q3BB0OLq004853@freefall.freebsd.org>
To: freebsd-sparc64@FreeBSD.org
From: Manuel Tobias Schiller
Subject: Re: sparc64/141918: [ehci] ehci_interrupt: unrecoverable error, controller halted (sparc64)
Reply-To: Manuel Tobias Schiller
List-Id: Porting FreeBSD to the Sparc

The following reply was made to PR sparc64/141918; it has been noted by GNATS.

From: Manuel Tobias Schiller
To: Marius Strobl
Cc: bug-followup@FreeBSD.org
Subject: Re: sparc64/141918: [ehci] ehci_interrupt: unrecoverable error, controller halted (sparc64)
Date: Wed, 11 Apr 2012 12:59:54 +0200

On Fri, 6 Apr 2012 20:37:26 +0200
Marius Strobl wrote:

> On Fri, Apr 06, 2012 at 09:58:42AM +0200, Manuel Tobias Schiller wrote:
> > On Thu, 5 Apr 2012 18:21:24 +0200
> > Manuel Tobias Schiller wrote:
> >
> > > On Wed, 4 Apr 2012 14:59:46 +0200
> > > Marius Strobl wrote:
> > >
> > > > Hrm, okay, would be interesting to know what the machine actually
> > > > does. Looking at the code I found another bug; the VIA-workaround
> > > > currently doesn't do anything:
> > > > http://people.freebsd.org/~marius/ehci_pci_fix_via_quirk.diff
> > > > This might apply for the insane I/O you've reported but I'm unsure
> > > > whether it makes a difference for the HSE interrupt.
> > > >
> > > > Marius
> > >
> > > From the looks of it (with your patch at
> > > http://people.freebsd.org/~marius/usb_busdma.diff), the machine
> > > starts booting, then tries to mount the filesystems residing on the
> > > USB disks, apparently does some I/O (while still processing
> > > interrupts), and after less than a minute locks up solid without
> > > any indication on the serial console as to what went wrong...
> > >
> > > I've started another build with your "VIA quirk fix" but without the
> > > patch in the last paragraph (the machine locking up is a lot worse
> > > than just USB not working after some heavy I/O, so I left it out
> > > for now), but since I started the build without being properly
> > > awake this morning, I typed "make buildworld" where I wanted to
> > > type "make buildkernel", so it's going to take some time.
> > > Also, I'll be leaving CERN over easter, so I won't be running tests
> > > on that machine from tomorrow morning until Monday evening (I can
> > > compile kernels, though). Anyhow, I'll let you know what comes out.
> > >
> > > Cheers, thanks a lot for your effort, and, of course, a Happy
> > > Easter!
> > >
> > > Manuel
> >
> > Hi,
> >
> > the "VIA quirk fix" on its own gives the familiar message in dmesg
> > (unrecoverable error, controller halted), so I'm compiling a kernel
> > which
>
> Oof, this likely means there's a more basic problem with this device.
> Have you already tried to re-seat the card in case there's an electrical
> problem?
> Please also provide the output of `pciconf -rb ehci0@pci0:2:5:2 0:255'
> from a booting kernel.
> FYI, after some digging I've found the following card
> ehci0@pci0:2:5:2: class=0x0c0320 card=0x31041106 chip=0x31041106
> rev=0x6h0 which is a newer revision of your device and works just fine
> in a T1-200 including with the usb(4) fixes. The publicly available
> datasheets for the VIA USB controllers are minimal and exclude errata
> and Linux also doesn't seem to use any additional work arounds, so I'm
> starting to run out of ideas what could be wrong with your revision.
> The only remaining thing to give a try I currently can think of is to
> test whether it chokes on the generic initialization done by the
> sparc64 PCI code using the attached patch.
>
> > combines this fix with your latest busdma fix to try them both
> > together;
>
> This combination is unlikely to make a difference.
>
> Marius
>

Hi Marius,

I've tried your new patch, both on its own and in conjunction with the
latest busdma and Via quirk fixes, and I still get the same error
message...

Here's the output of pciconf you requested:

mala@router:~> sudo pciconf -rb ehci0@pci0:2:5:2 0:255
Password:
06 11 04 31 06 00 10 22 65 20 03 0c 00 16 80 00
00 a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 06 11 04 31
00 00 00 00 80 00 00 00 00 00 00 00 14 03 00 00
00 00 0b 00 00 00 00 00 a0 20 00 29 00 00 ff ff
00 5a 04 80 00 00 00 00 04 0b 88 88 33 00 00 00
20 20 01 00 00 00 00 00 01 00 00 00 00 00 00 c0
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 0a 7e 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 03 00 00 00 00 00 00 00 00

This was taken after the controller stopped, on a kernel with your
latest patch, but I'd guess that doesn't matter - the EHCI driver
should not be playing with the PCI settings after initialisation...

I've also opened the machine, and the PCI card is seated properly. I
even removed it and tried an even older VIA EHCI controller and one of
the first USB 2.0 controllers by NEC - no luck, the VIA one had trouble
recognizing devices, the NEC one did not recognize a single one I
plugged in.

Is there anything else I can try?

Manuel

-- 
Homepage: http://www.hinterbergen.de/mala
OpenPGP: 0xA330353E (DSA) or 0xD87D188C (RSA)
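
As a reading aid for the pciconf dump in the message above, the standard header fields in its first 64 bytes can be decoded with a short script. This is a minimal sketch and not part of the original message: it assumes the generic PCI type-0 configuration header layout, and the names `DUMP`, `parse_header`, and `status_flags` are made up for illustration. It only restates what the bytes contain; it does not diagnose the EHCI halt.

```python
import struct

# First four rows (offsets 0x00-0x3f) of the `pciconf -rb` output above.
DUMP = """
06 11 04 31 06 00 10 22 65 20 03 0c 00 16 80 00
00 a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 06 11 04 31
00 00 00 00 80 00 00 00 00 00 00 00 14 03 00 00
"""

# Named single-bit flags of the 16-bit PCI status register.
# (Bits 9-10 form the DEVSEL timing field and are not listed here.)
STATUS_BITS = {
    4: "capabilities list",
    8: "master data parity error",
    11: "signaled target abort",
    12: "received target abort",
    13: "received master abort",
    14: "signaled system error",
    15: "detected parity error",
}


def parse_header(text):
    """Decode the little-endian type-0 header fields (hypothetical helper)."""
    raw = bytes.fromhex("".join(text.split()))
    vendor, device, command, status = struct.unpack_from("<HHHH", raw, 0x00)
    revision, prog_if, subclass, base_class = struct.unpack_from("<BBBB", raw, 0x08)
    (bar0,) = struct.unpack_from("<I", raw, 0x10)
    sub_vendor, sub_device = struct.unpack_from("<HH", raw, 0x2C)
    return {
        "vendor": vendor,
        "device": device,
        "command": command,
        "status": status,
        "revision": revision,
        "class_code": (base_class << 16) | (subclass << 8) | prog_if,
        "multifunction": bool(raw[0x0E] & 0x80),
        "bar0": bar0,
        "subsystem": (sub_vendor, sub_device),
        "cap_ptr": raw[0x34],
        "intline": raw[0x3C],
        "intpin": raw[0x3D],
    }


def status_flags(status):
    """Return the names of the status bits that are set (hypothetical helper)."""
    return [name for bit, name in sorted(STATUS_BITS.items()) if status & (1 << bit)]


hdr = parse_header(DUMP)
# Print in the same chip=/class=/rev= style that pciconf uses.
print(f"chip=0x{hdr['device']:04x}{hdr['vendor']:04x} "
      f"class=0x{hdr['class_code']:06x} rev=0x{hdr['revision']:02x}")
print(f"command=0x{hdr['command']:04x} status=0x{hdr['status']:04x}",
      status_flags(hdr["status"]))
```

The decoded chip and class values agree with the `chip=0x31041106` and `class=0x0c0320` quoted earlier in the thread. The status bits are latched by the device, so a set bit shows only that the condition occurred at some point before the dump was taken.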