From: John Baldwin
To: David Naylor
Cc: Alexander Motin, FreeBSD-Current
Date: Wed, 28 Mar 2012 14:37:35 -0400
Subject: Re: [regression] unable to boot: no GEOM devices found.
Message-Id: <201203281437.35746.jhb@freebsd.org>
In-Reply-To: <201105092024.41588.naylor.b.david@gmail.com>
References: <201104152329.59294.naylor.b.david@gmail.com>
 <201105092024.41588.naylor.b.david@gmail.com>

On Monday, May 09, 2011 2:24:37 pm David Naylor wrote:
> On Friday 15 April 2011 23:29:55 David Naylor wrote:
> > On Friday 15 April 2011 18:28:06 John Baldwin wrote:
> > > On Wednesday, April 13, 2011 1:07:06 pm David Naylor wrote:
> > > > On Tuesday 12 April 2011 22:12:55 Alexander Motin wrote:
> > > > > David Naylor wrote:
> > > > > > On Tuesday 12 April 2011 08:17:51 Alexander Motin wrote:
> > > > > >> David Naylor wrote:
> > > > > >>> I am running -current and since a few days ago (at least
> > > > > >>> 2011/04/11) I am unable to boot.
> > > > > >>>
> > > > > >>> The boot process stops when it looks to find a bootable device.
> > > > > >>> The prompt (when pressing '?') does not display any device and
> > > > > >>> yielding one second (or more) to the kernel (by pressing '.')
> > > > > >>> does not improve the situation.
> > > > > >>>
> > > > > >>> A known working date is 2011/02/20.
> > > > > >>>
> > > > > >>> I am running amd64 on a nVidia MCP51 chipset.
> > > > > >>
> > > > > >> MCP51... again...
> > > > >
> > > > > +ata2: reiniting channel ..
> > > > > +ata2: SATA connect time=0ms status=00000113
> > > > > +ata2: reset tp1 mask=01 ostat0=58 ostat1=00
> > > > > +ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
> > > > > +ata2: reset tp2 stat0=50 stat1=00 devices=0x1
> > > > > +ata2: reinit done ..
> > > > > +unknown: FAILURE - ATA_IDENTIFY timed out LBA=0
> > > > >
> > > > > As soon as all devices detected but not responding to commands, I
> > > > > would suppose that there is something wrong with ATA interrupts.
> > > > > There is a long chain of interrupt problems in this chipset.
> > > > > I have already tried to debug one case where ATA wasn't generating
> > > > > interrupts at all. Unfortunately, without success -- requests were
> > > > > executing, but not generating interrupts, it wasn't looked like ATA
> > > > > driver problem.
> > > > >
> > > > > What's about possible candidate to revision triggering your problem,
> > > > > I would look on this message:
> > > > > +pcib0: Enabling MSI window for HyperTransport slave at pci0:0:9:0
> > > > >
> > > > > At least it is recent (SVN revs 219737,219740 on 2011-03-18 by jhb)
> > > > > and it is interrupt related.
> > > >
> > > > I reverted those two revs and everything works again.
> > >
> > > Hmm, can you provide a full boot verbose dmesg? Alternatively, can you
> > > see if the device at pci0:0:9:0 is a PCI-PCI bridge?
> >
> > I can provide a verbose dmesg if the following is not enough:
> >
> > none17@pci0:0:9:0: class=0x050000 card=0x50011458 chip=0x027010de
> > rev=0xa2 hdr=0x00
> >     vendor     = 'NVIDIA Corporation'
> >     device     = 'MCP51 Host Bridge'
> >     class      = memory
> >     subclass   = RAM
> >
> > I see two PCI-PCI bridges at pci0:0:3:0 and pci0:0:16:0. I've attached the
> > full `pciconf -lv` output.
>
> FYI, this issue is still present on current (~24 hours old). Reverting the
> above mentioned revisions still fixes the problem.

I finally had an idea about a way to solve this (at least when using ACPI)
that doesn't involve a whole bunch of quirks, etc. Please try
http://www.FreeBSD.org/~jhb/patches/hostb_htmsi.patch

-- 
John Baldwin