From owner-freebsd-stable@FreeBSD.ORG Fri May 2 18:06:54 2014 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id A766A11B for ; Fri, 2 May 2014 18:06:54 +0000 (UTC) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 7C49818EE for ; Fri, 2 May 2014 18:06:54 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 85977B945; Fri, 2 May 2014 14:06:53 -0400 (EDT) From: John Baldwin To: Don Lewis Subject: Re: Thinkpad R60 hangs when booting recent 8.4-STABLE Date: Fri, 2 May 2014 14:05:32 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.4-CBSD-20140415; KDE/4.5.5; amd64; ; ) References: <201405020939.s429dJVq039239@gw.catspoiler.org> In-Reply-To: <201405020939.s429dJVq039239@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201405021405.32570.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 02 May 2014 14:06:53 -0400 (EDT) Cc: stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 02 May 2014 18:06:54 -0000 On Friday, May 02, 2014 5:39:19 am Don Lewis wrote: > On 1 May, John Baldwin wrote: > > On Thursday, May 01, 2014 1:55:41 am Don Lewis wrote: > >> On 30 Apr, John Baldwin wrote: > >> > >> > Are you up for doing some printf sleuthing? There are two odd things that I > >> > see so far: > >> > > >> > 1) the base address of 0. The question here is if pci_add_map() in > >> > sys/dev/pci/pci.c decides to set start to 0 explicitly, or if it happens > >> > further up the callchain (should be bus_alloc_resource calls in > >> > sys/dev/acpica/acpi_pcib_acpi.c, sys/x86/x86/nexus.c and then in the > >> > rman code itself in sys/kern/subr_rman.c) > >> > > >> > 2) The 'reserved' printfs during boot probe. Those come from a printf in > >> > pci_alloc_resource() in sys/dev/pci/pci.c. However, that should not be called > >> > until a driver attaches to a device and calls bus_alloc_resource(). It should > >> > not be called from pci_add_child() as it seems to be now. > >> > >> The call graph for the four earlier ones that you previously pointed > >> out (not hostb0) is: > >> pci_add_child() > >> pci_add_resources() > >> *_early_takeover() > >> [I suspect] > >> bus_alloc_resource_any() > >> pci_alloc_resource() > >> > >> These are the three system uhci controllers and the system ehci > >> controller, which apparently pass this test: > >> pci_get_class(dev) == PCIC_SERIALBUS && > >> pci_get_subclass(dev) == PCIS_SERIALBUS_USB > > > > Oh, ok. That is fine, and that explains why it was selective in the past (and > > only for I/O resources). That just leaves 1) then. It would be especially good > > to know what pci_add_map() does when it sees this BAR during the bus probe. > > In either case after reading the BAR, base=0xd0000000, > testval=0xf0000008. and the following gets printed: > map[10]: type Prefetchable Memory, range 32, base 0xd0000000, size 28, enabled > Then we call resource_list_add(..., d0000000, dfffffff, 10000000) > > When we don't pass the size in flags (w/o r262226), > resource_list_alloc() eventually calls nexus_alloc_resource() with the > expected start, end, and count values. After returning from > pci_add_map, we write 0xd0000000 to the BAR. > > If we *do* pass the size in flags (with r262226), then > resource_list_alloc() never calls nexus_alloc_resource(), but it does > return a non-NULL value. We then write 0 to the BAR. > > Digging into resource_list_alloc(), I see start, count, and end having > the expected values before the second call to BUS_ALLOC_RESOURCE() > but afterwards rman_get_start(rle->res) is 0 and rman_get_end(rle->res) > is 0xfffffff. We don't call nexus_alloc_resource(). Setting flags to 0 > for this specific call to BUS_ALLOC_RESOURCE() based on the value of > start fixes the problem. > > The culprit appears to be the call to rman_reserve_resource() in > acpi_alloc_resource(). With r262226 it "succeeds", but the returned > resource has the wrong start and end addresses. Without r262226, > rman_reserve_resource returns NULL and we call BUS_ALLOC_RESOURCE(), > which calls nexus_alloc_resource(), which returns the expected range. > > I enabled debug.rman_debug and saw the following: > > rman_reserve_resource_bound: request: > [0xd0000000, 0xdfffffff], length 0x10000000, flags 28736, device (null) > considering [0xfec00000, 0xffffffff] > truncated region: [0, 0xdfffffff]; size 0xe0000000 (requested > 0x10000000) > candidate region [0, 0xdfffffff], size 0xe0000000 > allocating at the end > > This is with subr_rman.c from HEAD. > > This expression will overflow: > if (s->r_start + count - 1 > end) { > I don't understand the point of this test. Is it an optimization to > exit the loop early based on an assumption about the ordering of > elements in the list? Yes. The elements in an rman list are supposed to be sorted. > This expression will also overflow: > rstart = (rstart + amask) & ~amask; > which is where I think the start address is getting set to 0. > > > After reverting subr_rman.c and applying the following patch, I see: > considering [0xfec00000, 0xffffffff] > no unshared regions found > Then nexus_alloc_resource() gets called and I see > considering [0x3ff60000, 0xfebfffff] > and then all the right stuff happens. > > > Index: sys/kern/subr_rman.c > =================================================================== > --- sys/kern/subr_rman.c (revision 262226) > +++ sys/kern/subr_rman.c (working copy) > @@ -468,11 +468,9 @@ > */ > for (s = r; s; s = TAILQ_NEXT(s, r_link)) { > DPRINTF(("considering [%#lx, %#lx]\n", s->r_start, s->r_end)); > - if (s->r_start + count - 1 > end) { > - DPRINTF(("s->r_start (%#lx) + count - 1> end (%#lx)\n", > - s->r_start, end)); > - break; > - } > + if (s->r_end < start + count - 1 || > + s->r_start > end - count + 1) > + continue; > if (s->r_flags & RF_ALLOCATED) { > DPRINTF(("region is allocated\n")); > continue; > > I think this looks good. > I have no idea why 9.2-STABLE was able to boot on this machine ... acpi_alloc_resource() works differently in 9.x and later. It only tries the internal rmans if nexus_alloc_resource() fails. In 8.x it tries the internal rmans first. That change was made in r216674. -- John Baldwin