From owner-freebsd-stable@FreeBSD.ORG  Fri May  2 18:06:54 2014
Return-Path: <owner-freebsd-stable@FreeBSD.ORG>
Delivered-To: stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id A766A11B
 for <stable@freebsd.org>; Fri,  2 May 2014 18:06:54 +0000 (UTC)
Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1])
 (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
 (Client did not present a certificate)
 by mx1.freebsd.org (Postfix) with ESMTPS id 7C49818EE
 for <stable@freebsd.org>; Fri,  2 May 2014 18:06:54 +0000 (UTC)
Received: from jhbbsd.localnet (unknown [209.249.190.124])
 by bigwig.baldwin.cx (Postfix) with ESMTPSA id 85977B945;
 Fri,  2 May 2014 14:06:53 -0400 (EDT)
From: John Baldwin <jhb@freebsd.org>
To: Don Lewis <dl-kvmr@catspoiler.org>
Subject: Re: Thinkpad R60 hangs when booting recent 8.4-STABLE
Date: Fri, 2 May 2014 14:05:32 -0400
User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20140415; KDE/4.5.5; amd64; ; )
References: <201405020939.s429dJVq039239@gw.catspoiler.org>
In-Reply-To: <201405020939.s429dJVq039239@gw.catspoiler.org>
MIME-Version: 1.0
Content-Type: Text/Plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Message-Id: <201405021405.32570.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7
 (bigwig.baldwin.cx); Fri, 02 May 2014 14:06:53 -0400 (EDT)
Cc: stable@freebsd.org
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Production branch of FreeBSD source code <freebsd-stable.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/options/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-stable/>
List-Post: <mailto:freebsd-stable@freebsd.org>
List-Help: <mailto:freebsd-stable-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-stable>,
 <mailto:freebsd-stable-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 02 May 2014 18:06:54 -0000

On Friday, May 02, 2014 5:39:19 am Don Lewis wrote:
> On  1 May, John Baldwin wrote:
> > On Thursday, May 01, 2014 1:55:41 am Don Lewis wrote:
> >> On 30 Apr, John Baldwin wrote:
> >> 
> >> > Are you up for doing some printf sleuthing?  There are two odd things that I 
> >> > see so far:
> >> > 
> >> > 1) the base address of 0.  The question here is if pci_add_map() in 
> >> > sys/dev/pci/pci.c decides to set start to 0 explicitly, or if it happens 
> >> > further up the callchain (should be bus_alloc_resource calls in 
> >> > sys/dev/acpica/acpi_pcib_acpi.c, sys/x86/x86/nexus.c and then in the
> >> > rman code itself in sys/kern/subr_rman.c)
> >> > 
> >> > 2) The 'reserved' printfs during boot probe.  Those come from a printf in 
> >> > pci_alloc_resource() in sys/dev/pci/pci.c.  However, that should not be called 
> >> > until a driver attaches to a device and calls bus_alloc_resource().  It should 
> >> > not be called from pci_add_child() as it seems to be now.
> >> 
> >> The call graph for the four earlier ones that you previously pointed
> >> out (not hostb0) is:
> >> 	pci_add_child()
> >> 		pci_add_resources()
> >> 			*_early_takeover()
> >> 				[I suspect]
> >> 				bus_alloc_resource_any()
> >> 					pci_alloc_resource()
> >> 					
> >> These are the three system uhci controllers and the system ehci
> >> controller, which apparently pass this test:
> >> 	pci_get_class(dev) == PCIC_SERIALBUS &&
> >> 	pci_get_subclass(dev) == PCIS_SERIALBUS_USB
> > 
> > Oh, ok.  That is fine, and that explains why it was selective in the past (and
> > only for I/O resources).  That just leaves 1) then.  It would be especially good
> > to know what pci_add_map() does when it sees this BAR during the bus probe.
> 
> In either case after reading the BAR, base=0xd0000000,
> testval=0xf0000008. and the following gets printed:
>         map[10]: type Prefetchable Memory, range 32, base 0xd0000000, size 28, enabled
> Then we call resource_list_add(..., d0000000, dfffffff, 10000000)
> 
> When we don't pass the size in flags (w/o r262226),
> resource_list_alloc() eventually calls nexus_alloc_resource() with the
> expected start, end, and count values.  After returning from
> pci_add_map, we write 0xd0000000 to the BAR.
> 
> If we *do* pass the size in flags (with r262226), then
> resource_list_alloc() never calls nexus_alloc_resource(), but it does
> return a non-NULL value.  We then write 0 to the BAR.
> 
> Digging into resource_list_alloc(), I see start, count, and end having
> the expected values before the second call to BUS_ALLOC_RESOURCE()
> but afterwards rman_get_start(rle->res) is 0 and rman_get_end(rle->res)
> is 0xfffffff.  We don't call nexus_alloc_resource(). Setting flags to 0
> for this specific call to BUS_ALLOC_RESOURCE() based on the value of
> start fixes the problem.
> 
> The culprit appears to be the call to rman_reserve_resource() in
> acpi_alloc_resource().  With r262226 it "succeeds", but the returned
> resource has the wrong start and end addresses.  Without r262226,
> rman_reserve_resource returns NULL and we call BUS_ALLOC_RESOURCE(),
> which calls nexus_alloc_resource(), which returns the expected range.
> 
> I enabled debug.rman_debug and saw the following:
> 
> rman_reserve_resource_bound: <ACPI I/O memory addresses> request:
> [0xd0000000, 0xdfffffff], length 0x10000000, flags 28736, device (null)
> considering [0xfec00000, 0xffffffff]
> truncated region: [0, 0xdfffffff]; size 0xe0000000 (requested
> 0x10000000)
> candidate region [0, 0xdfffffff], size 0xe0000000
> allocating at the end
> 
> This is with subr_rman.c from HEAD.
> 
> This expression will overflow:
> 		if (s->r_start + count - 1 > end) {
> I don't understand the point of this test.  Is it an optimization to
> exit the loop early based on an assumption about the ordering of
> elements in the list?

Yes.  The elements in an rman list are supposed to be sorted.

> This expression will also overflow:
> 			rstart = (rstart + amask) & ~amask;
> which is where I think the start address is getting set to 0.
> 
> 
> After reverting subr_rman.c and applying the following patch, I see:
> considering [0xfec00000, 0xffffffff]
> no unshared regions found
> Then nexus_alloc_resource() gets called and I see
> considering [0x3ff60000, 0xfebfffff]
> and then all the right stuff happens.
> 
> 
> Index: sys/kern/subr_rman.c
> ===================================================================
> --- sys/kern/subr_rman.c	(revision 262226)
> +++ sys/kern/subr_rman.c	(working copy)
> @@ -468,11 +468,9 @@
>  	 */
>  	for (s = r; s; s = TAILQ_NEXT(s, r_link)) {
>  		DPRINTF(("considering [%#lx, %#lx]\n", s->r_start, s->r_end));
> -		if (s->r_start + count - 1 > end) {
> -			DPRINTF(("s->r_start (%#lx) + count - 1> end (%#lx)\n",
> -			    s->r_start, end));
> -			break;
> -		}
> +		if (s->r_end < start + count - 1 ||
> +		    s->r_start > end - count + 1)
> +			continue;
>  		if (s->r_flags & RF_ALLOCATED) {
>  			DPRINTF(("region is allocated\n"));
>  			continue;
> 
> 

I think this looks good.

> I have no idea why 9.2-STABLE was able to boot on this machine ...

acpi_alloc_resource() works differently in 9.x and later.  It only
tries the internal rmans if nexus_alloc_resource() fails.  In 8.x it
tries the internal rmans first.  That change was made in r216674.

-- 
John Baldwin