Date: Wed, 13 Apr 2011 00:08:58 +0300
From: Alexander Motin <mav@FreeBSD.org>
To: pyunyh@gmail.com
Cc: FreeBSD-Current <freebsd-current@freebsd.org>, David Naylor <naylor.b.david@gmail.com>
Subject: Re: [regression] unable to boot: no GEOM devices found.
Message-ID: <4DA4BF6A.7010806@FreeBSD.org>
In-Reply-To: <20110412210354.GC1421@michelle.cdnetworks.com>
References: <mailpost.1302585106.8448174.20731.mailing.freebsd.current@FreeBSD.cs.nctu.edu.tw> <4DA3EE8F.8050306@FreeBSD.org> <201104122132.23809.naylor.b.david@gmail.com> <4DA4B247.6010901@FreeBSD.org> <20110412210354.GC1421@michelle.cdnetworks.com>

YongHyeon PYUN wrote:
> On Tue, Apr 12, 2011 at 11:12:55PM +0300, Alexander Motin wrote:
>> David Naylor wrote:
>>> On Tuesday 12 April 2011 08:17:51 Alexander Motin wrote:
>>>> David Naylor wrote:
>>>>> I am running -current and since a few days ago (at least 2011/04/11) I am
>>>>> unable to boot.
>>>>>
>>>>> The boot process stops when it looks to find a bootable device. The
>>>>> prompt (when pressing '?') does not display any device and yielding one
>>>>> second (or more) to the kernel (by pressing '.') does not improve the
>>>>> situation.
>>>>>
>>>>> A known working date is 2011/02/20.
>>>>>
>>>>> I am running amd64 on a nVidia MCP51 chipset.
>>>> MCP51... again...
>>>>
>>>>> I am willing to help any way I can.
>>>> You could start from capturing and showing verbose dmesg. Full or at
>>>> least in parts related to disks.
>>> I captured the dmesg output for both the old (working) kernel and the new
>>> (bad) kernel. See attached for the difference between the two. If you need
>>> the full dmesg please let me know.
>>>
>>> One thing I found is that the old kernel would not boot if I simply rebooted
>>> from the bad kernel. I had to do a hard power off before the old kernel would
>>> work again. Is some device state surviving between reboots?
>> +ata2: reiniting channel ..
>> +ata2: SATA connect time=0ms status=00000113
>> +ata2: reset tp1 mask=01 ostat0=58 ostat1=00
>> +ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
>> +ata2: reset tp2 stat0=50 stat1=00 devices=0x1
>> +ata2: reinit done ..
>> +unknown: FAILURE - ATA_IDENTIFY timed out LBA=0
>>
>> As soon as all devices detected but not responding to commands, I would
>> suppose that there is something wrong with ATA interrupts. There is a
>> long chain of interrupt problems in this chipset. I have already tried
>> to debug one case where ATA wasn't generating interrupts at all.
>> Unfortunately, without success -- requests were executing, but not
>> generating interrupts, it wasn't looked like ATA driver problem.
>>
>> What's about possible candidate to revision triggering your problem, I
>> would look on this message:
>> +pcib0: Enabling MSI window for HyperTransport slave at pci0:0:9:0
>>
>> At least it is recent (SVN revs 219737,219740 on 2011-03-18 by jhb) and
>> it is interrupt related.
>
> Does the driver disable MSI for MCP51?

ata(4) doesn't use MSI by default, and I doubt this controller supports
it anyway. But if I am not mixing something up, there were very strange
situations with MSI on that chipset, where enabling it on one device
caused interrupt problems on another.

> I think jhb's patch fixed one MSI issue of all MCP chipset.

I am not saying it is wrong. It could just trigger something.

--
Alexander Motin
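
For reference, the status words in the quoted dmesg lines can be decoded by
hand, which shows why the log reads as "device detected but not answering".
The following is only a minimal sketch, not anything taken from the ata(4)
sources. It assumes the values are printed in hexadecimal, that status= is
the SATA SStatus register (DET in bits 0-3, SPD in bits 4-7, IPM in bits
8-11), and that stat0/ostat0 are the ATA status register. The function names
are made up for illustration.

```python
# Standard ATA status register bits, highest bit first.
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data (obsolete)
    0x02: "IDX",   # index (obsolete)
    0x01: "ERR",   # error
}


def decode_sstatus(value):
    """Split a SATA SStatus value into its DET, SPD and IPM fields."""
    return {
        "DET": value & 0xF,         # 3 = device present, PHY link established
        "SPD": (value >> 4) & 0xF,  # 1 = Gen1 (1.5 Gbps)
        "IPM": (value >> 8) & 0xF,  # 1 = interface in active state
    }


def decode_ata_status(value):
    """Return the names of the bits set in an ATA status register value."""
    return [name for bit, name in ATA_STATUS_BITS.items() if value & bit]


if __name__ == "__main__":
    print(decode_sstatus(0x00000113))  # "SATA connect ... status=00000113"
    print(decode_ata_status(0x58))     # "ostat0=58"
    print(decode_ata_status(0x50))     # "stat0=0x50"
```

Under those assumptions, 0x113 decodes to DET=3, SPD=1, IPM=1 (link up and
active), and 0x50 to DRDY plus DSC with BSY clear, so the drive looks idle
and ready after the reset. That fits the reading above that the timeout is
more likely on the interrupt side than in device detection.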