Date:      Wed, 29 Aug 2012 08:18:20 -0400
From:      John Baldwin <jhb@freebsd.org>
To:        Mike A <mikea@mikea.ath.cx>
Cc:        freebsd-scsi@freebsd.org
Subject:   Re: Bug Report: IBM x3650M4 (32GB, 2x4-core Xeon E5-2600, IBM ServeRaid M5110e): fails in install with NMI
Message-ID:  <201208290818.20990.jhb@freebsd.org>
In-Reply-To: <20120828210618.GD69985@mikea.ath.cx>
References:  <20120827203817.GB44988@mikea.ath.cx> <201208281238.48041.jhb@freebsd.org> <20120828210618.GD69985@mikea.ath.cx>

On Tuesday, August 28, 2012 5:06:18 pm Mike A wrote:
> On Tue, Aug 28, 2012 at 12:38:47PM -0400, John Baldwin wrote:
> > 
> > When the loader menu pops up, choose the "escape to loader prompt" option,
> > then type 'set hint.mpt.0.msi_enable=0' followed by 'boot'.  There's no
> > guarantee this will help, btw, just something to try out first.
> > 
> > If that doesn't work, you can also try setting 'machdep.kdb_on_nmi=0' using
> > the same trick.
> > 
> > If that still doesn't help, please boot another OS that does boot and get the
> > output of 'lspci -v' or 'pciconf -lvb' or equivalent so we can see exactly
> > which mpt adapter it is.  I think there is one class of mpt(4) cards that
> > we do not yet support properly.  Ah, yes, this PR:
> > 
> > http://www.freebsd.org/cgi/query-pr.cgi?pr=149220
> > 
> > I think this may in fact be your adapter.  This was fixed after 9.0, so try
> > a 9.1-RC1 install disk instead and see if it works better.
> 
> No joy. In sober fact, neither 9.1 nor 9.0 will even boot reliably to the
> point where the usual dmesg contents are displayed. About 90% of the time,
> 9.0 will hit the DVD reader for a while, then go quiescent, followed by
> the yellow LED signaling an NMI or other serious problem and the bright
> blue flashing LED signaling a halted machine. I have yet to get any display
> out of 9.1 at all. I have changed all the changeables I can: booted from a
> complete power-down, booted from a halted system, etc. I can't see anything
> that always leads to a display or to a failure to display.
> 
> It is interesting that a Red Hat Enterprise Linux 5.1 (ancient!) DVD booted
> up first crack off the bat. It couldn't find any discs to install to,
> however, though it did inventory the SATA drives in its dmesg output.
> 
> I'm about to try a Knoppix DVD, and will post what PCI data I can get
> from that. 
> 
> I've entered the first loader hint and got no change in symptoms; since
> then, I have not been able to get another display in about 10 tries, and
> hence been unable to enter the first and second loader hints. At about 7
> minutes per try, this is enormously frustrating.
> 
> If there is a way to instrument the CD/DVD boot process itself, so that I
> can see what leads up to the failure to display, I am greatly interested
> in doing this. My employer has about $40K invested in these boxes, and
> is interested in getting some good out of them; I'm at least equally
> interested in not annoying my boss. You can have pretty much 100% of my
> work time until I get them on the air or give up and run some flavor of
> Linux; I'd really rather not run Linux.
> 
> At this point I don't know whether the problems stem from the RAID adapter
> hosing the CD/DVD boot process, or from some other impediment. It may be
> that this belongs in the amd64 group, instead of the scsi group. I don't
> see a way to tell until I (or you) can determine the cause of the CD/DVD
> boot problems.
> 
> Thanks so much for your help so far. 

Humm, that is bizarre.  All the early bootstrap code just relies on the BIOS
to perform disk I/O, etc.  Can you PXE boot these machines?  That might be a
way to get the CD out of the picture.  I haven't seen any machines with your
symptoms.  At the least, if a machine does have a problem with the boot process
due to a bug or some such, it is consistent in having the problem every time,
not suddenly failing after working.

Also, to be honest, the original NMI in itself is a bit odd.  If you are having
these problems now I do wonder if there isn't an underlying hardware issue.
Regardless, I think netbooting would be a good thing to look into, to get the
CD/DVD bit out of the way.
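
For reference, the two loader settings suggested earlier in the thread would
be entered at the loader prompt roughly as shown here. This is only a sketch of
that sequence: unit 0 for mpt comes from the hint quoted above, "OK" is the
loader's own prompt rather than something to type, and there is no guarantee
that either setting changes the symptoms.

```
OK set hint.mpt.0.msi_enable=0
OK set machdep.kdb_on_nmi=0
OK boot
```

Values set this way apply only to that one boot, so they have to be re-entered
on each attempt.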

-- 
John Baldwin


