Date:      Wed, 16 May 2012 16:30:19 +0100
From:      Anton Shterenlikht <mexas@bristol.ac.uk>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-current@freebsd.org, Anton Shterenlikht <mexas@bristol.ac.uk>
Subject:   Re: updating from r231158 to 234465: mounting from ufs:/dev/ad4s1a failed with error 19
Message-ID:  <20120516153019.GB9070@mech-cluster241.men.bris.ac.uk>
In-Reply-To: <201205161108.05809.jhb@freebsd.org>
References:  <20120426224215.GA79891@mech-cluster241.men.bris.ac.uk> <201205151211.09161.jhb@freebsd.org> <20120516084551.GA49037@mech-cluster241.men.bris.ac.uk> <201205161108.05809.jhb@freebsd.org>

On Wed, May 16, 2012 at 11:08:05AM -0400, John Baldwin wrote:
> On Wednesday, May 16, 2012 4:45:51 am Anton Shterenlikht wrote:
> > On Tue, May 15, 2012 at 12:11:09PM -0400, John Baldwin wrote:
> > > On Monday, May 07, 2012 5:25:02 pm Anton Shterenlikht wrote:
> > > > On Mon, May 07, 2012 at 10:39:51AM -0400, John Baldwin wrote:
> > > > > On Friday, May 04, 2012 4:07:24 pm Anton Shterenlikht wrote:
> > > > > > On Fri, May 04, 2012 at 11:07:59AM -0400, John Baldwin wrote:
> > > > > > > On Friday, May 04, 2012 7:51:33 am Anton Shterenlikht wrote:
> > > > > > > > On Thu, May 03, 2012 at 02:46:18PM -0400, John Baldwin wrote:
> > > > > > > > > On Thursday, May 03, 2012 11:35:19 am Anton Shterenlikht wrote:
> > > > > > > > > > On Tue, May 01, 2012 at 12:35:26PM +0100, Anton Shterenlikht wrote:
> > > > > > > > > > > On Mon, Apr 30, 2012 at 08:43:14AM -0400, John Baldwin wrote:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I also see:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > ata0: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
> > > > > > > > > > > > > ata0: stat1=0x00 err=0x00 lsb=0x00 msb=0x00
> > > > > > > > > > > > > ata0: reset tp2 stat0=00 stat1=00 devices=0x10000
> > > > > > > > > > > > 
> > > > > > > > > > > > Hmmm, I don't know how to grok these lines, but does your disk work
> > > > > > > > > > > > at all now with any kernel?  It may be that your disk has died (or a
> > > > > > > > > > > > cable, etc.) and it just happened to coincide with your upgrade?
> > > > > > > > > > > 
> > > > > > > > > > > I reverted back to r231158, built world and generic
> > > > > > > > > > > kernel (minus all modules, i.e. "option MODULES_OVERRIDE=").
> > > > > > > > > > > This works, see the verbose boot dmesg at the end.
> > > > > > > > > > > 
> > > > > > > > > > > I think I'll just do a binary search.
> > > > > > > > > > 
> > > > > > > > > > I traced it to r233677.
> > > > > > > > > > The only change from 233676 to 233677 is
> > > > > > > > > > in /sys/dev/pci/pci.c
> > > > > > > > > > 
> > > > > > > > > > My kernel is GENERIC with no modules
> > > > > > > > > > and with various bits removed, e.g. all raid devices
> > > > > > > > > > and PCI network devices, which I definitely
> > > > > > > > > > haven't got on this laptop.
> > > > > > > > > > 
> > > > > > > > > > Below is the verbose boot with r233676.
> > > > > > > > > > Apparently at the beginning there's also
> > > > > > > > > > the previous unsuccessful boot with r233677.
> > > > > > > > > > Is this a new feature? I didn't know the
> > > > > > > > > > previous dmesg is preserved after a reboot.
> > > > > > > > > > Anyway, you can see clearly the error with r233677.
> > > > > > > > > > 
> > > > > > > > > > I guess this is something to do with
> > > > > > > > > > ata -> ada change?
> > > > > > > > > 
> > > > > > > > > I don't think so.
> > > > > > > > > 
> > > > > > > > > Please try just this change:
> > > > > > > > > 
> > > > > > > > > Index: pci.c
> > > > > > > > > ===================================================================
> > > > > > > > > --- pci.c	(revision 234928)
> > > > > > > > > +++ pci.c	(working copy)
> > > > > > > > > @@ -2822,10 +2822,14 @@ pci_add_map(device_t bus, device_t dev, int reg, s
> > > > > > > > >  		 * from the parent.
> > > > > > > > >  		 */
> > > > > > > > >  		resource_list_delete(rl, type, reg);
> > > > > > > > > -	} else {
> > > > > > > > > +		start = 0;
> > > > > > > > > +		device_printf(bus,
> > > > > > > > > +		    "pci%d:%d:%d:%d bar %#x failed to allocate",
> > > > > > > > > +		    pci_get_domain(dev), pci_get_bus(dev), pci_get_slot(dev),
> > > > > > > > > +		    pci_get_function(dev), reg);
> > > > > > > > > +	} else
> > > > > > > > >  		start = rman_get_start(res);
> > > > > > > > > -		pci_write_bar(dev, pm, start);
> > > > > > > > > -	}
> > > > > > > > > +	pci_write_bar(dev, pm, start);
> > > > > > > > >  	return (barlen);
> > > > > > > > >  }
> > > > > > > > >  
> > > > > > > > 
> > > > > > > > That helped, thank you.
> > > > > > > 
> > > > > > > Bizarre, can you get a regular dmesg with that change applied?
> > > > > > 
> > > > > 
> > > > > Hmm, I missed a newline at the end. :)  Looks like this happened twice.
> > > > > I've added the relevant verbose boot messages from your earlier kernel
> > > > > below each one:
> > > > > 
> > > > > > pci0: <ACPI PCI bus> on pcib0
> > > > > > pci0: pci0:0:20:2 bar 0x10 failed to allocate
> > > > > > pcib1: <ACPI PCI-PCI bridge> at device 1.0 on pci0
> > > > > 
> > > > > found-> vendor=0x1002, dev=0x4383, revid=0x00
> > > > >         domain=0, bus=0, slot=20, func=2
> > > > >         class=04-03-00, hdrtype=0x00, mfdev=0
> > > > >         cmdreg=0x0006, statreg=0x0410, cachelnsz=16 (dwords)
> > > > >         lattimer=0x40 (1920 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
> > > > >         intpin=a, irq=10
> > > > >         powerspec 2  supports D0 D3  current D0
> > > > >         map[10]: type Memory, range 64, base 0xcc408000, size 14, enabled
> > > > > pcib0: matched entry for 0.20.INTA
> > > > > pcib0: slot 20 INTA hardwired to IRQ 16
> > > > > 
> > > > > > pcib4: <ACPI PCI-PCI bridge> at device 20.4 on pci0
> > > > > > pcib4: failed to allocate initial memory window: 0xcc100000-0xcc1fffff
> > > > > > pci2: <ACPI PCI bus> on pcib4
> > > > > > pci2: pci0:2:4:0 bar 0x10 failed to allocate
> > > > > > cbb0: <RF5C476 PCI-CardBus Bridge> irq 20 at device 4.0 on pci2
> > > > > 
> > > > > found-> vendor=0x1180, dev=0x0476, revid=0xb6
> > > > >         domain=0, bus=2, slot=4, func=0
> > > > >         class=06-07-00, hdrtype=0x02, mfdev=1
> > > > >         cmdreg=0x0007, statreg=0x0210, cachelnsz=0 (dwords)
> > > > >         lattimer=0x40 (1920 ns), mingnt=0x80 (32000 ns), maxlat=0x07 (1750 ns)
> > > > >         intpin=a, irq=10
> > > > >         powerspec 2  supports D0 D1 D2 D3  current D0
> > > > >         map[10]: type Memory, range 32, base 0xcc100000, size 12, enabled
> > > > > pcib4: failed to allocate initial memory window (0xcc100000-0xcc1fffff,0x100000)
> > > > > pcib4: matched entry for 2.4.INTA
> > > > > pcib4: slot 4 INTA hardwired to IRQ 20
> > > > > cbb0: <RF5C476 PCI-CardBus Bridge> irq 20 at device 4.0 on pci2
> > > > > pcib0: allocated type 3 (0xcc500000-0xcc5fffff) for rid 20 of pcib4
> > > > > pcib4: allocated initial memory window of 0xcc500000-0xcc5fffff
> > > > > pcib4: allocated memory range (0xcc500000-0xcc500fff) for rid 10 of cbb0
> > > > > cbb0: Lazy allocation of 0x1000 bytes rid 0x10 type 3 at 0xcc500000
> > > > > 
> > > > > So the second case actually recovers and allocates a different range.
> > > > > 
> > > > > Can you try booting with 'debug.acpi.disabled=sysres' set in the loader?
> > > > 
> > > > You mean without the patch?
> > > 
> > > Either way.
> > > 
> > > > > Also, can you get the output of 'devinfo -rv' from a working kernel?
> > > 
> > > Oops, I meant to ask for devinfo -u, sorry. :(
> > > 
> > > Oh, I see it now.  Your BIOS is broken.
> > > 
> > > The hdac0 device is assigned a resource that conflicts with pcib4, though
> > > that is the one we recover from:
> > > 
> > > >         hdac0 pnpinfo vendor=0x1002 device=0x4383 subvendor=0x103c subdevice=0x30c2 class=0x040300 at slot=20 function=2 handle=\_SB_.C08B.C0FD
> > > >             Interrupt request lines:
> > > >                 16
> > > >             I/O memory addresses:
> > > >                 0xcc100000-0xcc103fff
> > > 
> > > For the CardBus Bridge, the issue is this device:
> > > 
> > > >         ahci0 pnpinfo vendor=0x1002 device=0x4380 subvendor=0x1002 subdevice=0x4380 class=0x01018f at slot=18 function=0 handle=\_SB_.C08B.C275
> > > >             Interrupt request lines:
> > > >                 16
> > > >             I/O ports:
> > > >                 0x5018-0x501b
> > > >                 0x5020-0x502f
> > > >                 0x9000-0x9007
> > > >                 0x9008-0x900b
> > > >                 0x9010-0x9017
> > > >             I/O memory addresses:
> > > >                 0xcc409000-0xcc4093ff
> > > 
> > > That last memory BAR conflicts with the desired range of 0xcc408000-0xcc40c000.
> > > 
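
(Editorial aside: the conflict described above can be checked with plain
interval arithmetic.  The following is only a minimal sketch in C.  It assumes
that ranges are inclusive, as devinfo prints them, and that "size 14" in the
verbose boot output means the BAR decodes 2^14 bytes.  The names addr_range,
bar_range and ranges_overlap are hypothetical and are not part of pci.c.)

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive address range, in the form devinfo prints (start-end). */
struct addr_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Build the range decoded by a BAR from its base address and the
 * log2 size shown in the verbose boot output ("size 14" -> 2^14 bytes).
 * This interpretation of "size" is an assumption of the sketch.
 */
static struct addr_range
bar_range(uint64_t base, unsigned size_log2)
{
	struct addr_range r;

	assert(size_log2 < 64);	/* avoid an undefined shift */
	r.start = base;
	r.end = base + ((uint64_t)1 << size_log2) - 1;
	return (r);
}

/* Two inclusive ranges overlap if each one starts before the other ends. */
static int
ranges_overlap(struct addr_range a, struct addr_range b)
{
	return (a.start <= b.end && b.start <= a.end);
}
```

With the values from this thread, bar_range(0xcc408000, 14) covers
0xcc408000-0xcc40bfff, and ranges_overlap() reports that it overlaps the
ahci0 memory range 0xcc409000-0xcc4093ff listed above.
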
> > > I'm not sure why BIOS writers are so grossly incompetent, but such is life.
> > > 
> > > Try this:
> > > 
> > > Index: pci.c
> > > ===================================================================
> > > --- pci.c	(revision 235475)
> > > +++ pci.c	(working copy)
> > > @@ -2815,13 +2815,36 @@ pci_add_map(device_t bus, device_t dev, int reg, s
> > >  	 */
> > >  	res = resource_list_reserve(rl, bus, dev, type, &reg, start, end, count,
> > >  	    prefetch ? RF_PREFETCHABLE : 0);
> > > +	if (res == NULL && (start != 0 || end != ~0ul)) {
> > > +		/*
> > > +		 * If the allocation fails, try to allocate a resource for
> > > +		 * this BAR using any available range.  The firmware felt
> > > +		 * it was important enough to assign a resource, so don't
> > > +		 * disable decoding if we can help it.
> > > +		 */
> > > +		resource_list_delete(rl, type, reg);
> > > +		start = 0;
> > > +		end = ~0ul;
> > > +		resource_list_add(rl, type, reg, 0, ~0ul, count);
> > > +		resource_list_add(rl, type, reg, start, end, count);
> > > +		res = resource_list_reserve(rl, bus, dev, type, &reg, 0, ~0ul,
> > > +		    count, prefetch ? RF_PREFETCHABLE : 0);
> > > +	}
> > >  	if (res == NULL) {
> > >  		/*
> > >  		 * If the allocation fails, delete the resource list entry
> > > -		 * to force pci_alloc_resource() to allocate resources
> > > -		 * from the parent.
> > > +		 * and disable decoding for this device.
> > > +		 *
> > > +		 * If the driver requests this resource in the future,
> > > +		 * pci_reserve_map() will try to allocate fresh resources.
> > >  		 */
> > >  		resource_list_delete(rl, type, reg);
> > > +		pci_disable_io(dev, type);
> > > +		start = 0;
> > > +		device_printf(bus,
> > > +		    "pci%d:%d:%d:%d bar %#x failed to allocate",
> > > +		    pci_get_domain(dev), pci_get_bus(dev), pci_get_slot(dev),
> > > +		    pci_get_function(dev), reg);
> > >  	} else {
> > >  		start = rman_get_start(res);
> > >  		pci_write_bar(dev, pm, start);
> > > 
> > > 
> > > -- 
> > > John Baldwin
> > 
> > Below are (1) verbose boot dmesg and
> > (2) devinfo -u, with your patch
> > and with 'debug.acpi.disabled=sysres'
> > set in the loader.
> 
> Humm, so did the patch help at all?  (Wasn't clear from your e-mail.)
> 
> It seems to have not helped given that the BAR still failed to allocate?

er.. yes, of course it helped.

My problem was that I couldn't boot.
So, I presumed the very existence of dmesg.boot
showed that your patches (both of them) work fine.
But, sorry, I could've been more explicit.
All seems to work, including sound and wireless.

Thanks

-- 
Anton Shterenlikht
Room 2.6, Queen's Building
Mech Eng Dept
Bristol University
University Walk, Bristol BS8 1TR, UK
Tel: +44 (0)117 331 5944
Fax: +44 (0)117 929 4423