From owner-freebsd-sparc64@FreeBSD.ORG  Tue Apr  3 15:10:04 2012
Return-Path: <owner-freebsd-sparc64@FreeBSD.ORG>
Delivered-To: freebsd-sparc64@hub.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 67384106564A
	for <freebsd-sparc64@hub.freebsd.org>;
	Tue,  3 Apr 2012 15:10:04 +0000 (UTC)
	(envelope-from gnats@FreeBSD.org)
Received: from freefall.freebsd.org (freefall.freebsd.org
	[IPv6:2001:4f8:fff6::28])
	by mx1.freebsd.org (Postfix) with ESMTP id 420068FC0C
	for <freebsd-sparc64@hub.freebsd.org>;
	Tue,  3 Apr 2012 15:10:04 +0000 (UTC)
Received: from freefall.freebsd.org (localhost [127.0.0.1])
	by freefall.freebsd.org (8.14.5/8.14.5) with ESMTP id q33FA45c040960
	for <freebsd-sparc64@freefall.freebsd.org>; Tue, 3 Apr 2012 15:10:04 GMT
	(envelope-from gnats@freefall.freebsd.org)
Received: (from gnats@localhost)
	by freefall.freebsd.org (8.14.5/8.14.5/Submit) id q33FA4ro040959;
	Tue, 3 Apr 2012 15:10:04 GMT (envelope-from gnats)
Date: Tue, 3 Apr 2012 15:10:04 GMT
Message-Id: <201204031510.q33FA4ro040959@freefall.freebsd.org>
To: freebsd-sparc64@FreeBSD.org
From: Marius Strobl <marius@alchemy.franken.de>
Cc: 
Subject: Re: sparc64/141918: [ehci] ehci_interrupt: unrecoverable error,
	controller halted (sparc64)
X-BeenThere: freebsd-sparc64@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: Marius Strobl <marius@alchemy.franken.de>
List-Id: Porting FreeBSD to the Sparc <freebsd-sparc64.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-sparc64>, 
	<mailto:freebsd-sparc64-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-sparc64>
List-Post: <mailto:freebsd-sparc64@freebsd.org>
List-Help: <mailto:freebsd-sparc64-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-sparc64>,
	<mailto:freebsd-sparc64-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 03 Apr 2012 15:10:04 -0000

The following reply was made to PR sparc64/141918; it has been noted by GNATS.

From: Marius Strobl <marius@alchemy.franken.de>
To: Manuel Tobias Schiller <mala@hinterbergen.de>
Cc: bug-followup@FreeBSD.org
Subject: Re: sparc64/141918: [ehci] ehci_interrupt: unrecoverable error, controller halted (sparc64)
Date: Tue, 3 Apr 2012 17:00:43 +0200

 On Tue, Apr 03, 2012 at 10:37:14AM +0200, Manuel Tobias Schiller wrote:
 > On Mon, 2 Apr 2012 10:43:14 +0200
 > Manuel Tobias Schiller <mala@hinterbergen.de> wrote:
 > 
 > > On Mon, 2 Apr 2012 01:00:56 +0200
 > > Manuel Tobias Schiller <mala@hinterbergen.de> wrote:
 > > 
 > > > On Sun, 1 Apr 2012 12:41:24 +0200
 > > > Marius Strobl <marius@alchemy.franken.de> wrote:
 > > > 
 > > > > Well, the individual patches shouldn't make things worse, except for
 > > > > the second one causing more memory to be used, so I'd suggest
 > > > > combining them. If in the end things actually work, we can still check
 > > > > which changes are needed for that.
 > > > > Looking at the Linux USB code, the FreeBSD one doesn't seem to honor
 > > > > some DMA constraints and at least for the alignment it's actually
 > > > > hard to follow what value eventually is used. One thing that stands
 > > > > out is that for EHCI, the boundary is 4096. This is most easily
 > > > > fixed by defining USB_PAGE_SIZE to 4096 in sys/dev/usb/usb_busdma.h.
 > > > > 
 > > > > Marius
 > > > 
 > > > Ok, the second patch on its own doesn't appear to work either, so I'm
 > > > trying the combination of patches now. By the way: defining
 > > > USB_PAGE_SIZE to 4096 in sys/dev/usb/usb_busdma.h is a bad idea - the
 > > > kernel panics with a backtrace pointing into the mmu-related code.
 > > > Probably has to do with sparc64 mmu only supporting 8k pages, so I'm
 > > > not terribly surprised... Ok, I'm waiting for the next make
 > > > buildkernel to finish, and I'll let you know what comes out.
 > > > 
 > > > Manuel
 > > 
 > > Ok, I also tested a kernel with both patches, and the issue persists. Do
 > > you have something else to try?
 > > 
 > > Manuel
 > >
 > 
 > Hi Marius,
 > 
 > I did a bit of code reading (/usr/src/sys/dev/usb/controller/ehci.c near
 > line 1494), and I realised that the "unrecoverable error" message should
 > only be triggered if the EHCI status register has the EHCI_STS_HCH bit
 > set - according to the status word dump in my log, it is not set (just
 > after the "unrecoverable error" message). The register dump re-reads the
 > status register from the hardware. Could it be that some controllers have
 > a glitch or something on that particular bit, and we had better re-read the
 > status register before we conclude that the controller "really wanted to
 > set that bit"?
 
 You mean EHCI_STS_HSE? This is expected: ehci_interrupt() clears the
 pending interrupt status bits before dumping the register content:
 EOWRITE4(sc, EHCI_USBSTS, status);      /* acknowledge */
 
 > I can also see that the bit is set in the original bug report. I don't
 > know if that machine is just faster (and the bit has not had the time to
 > clear yet), or if we're talking about two different problems here...
 
 Probably, the other controller just sets it again after the bit is
 cleared.
 
 Marius