Date:      Tue, 22 Sep 1998 09:47:21 -0500 (CDT)
From:      Doug Ledford <dledford@dialnet.net>
To:        "Robert G. Brown" <rgb@phy.duke.edu>
Cc:        aic7xxx Mailing List <AIC7xxx@FreeBSD.ORG>
Subject:   Re: One more 2300 healthy (rats?)
Message-ID:  <Pine.LNX.3.96.980922093922.9555B-100000@dledford.dialnet.net>
In-Reply-To: <Pine.LNX.3.96.980922100950.14937D-100000@ganesh.phy.duke.edu>

On Tue, 22 Sep 1998, Robert G. Brown wrote:

> Aww, and I already started to use the machine (yes, Virginia, I do
> actually do MC simulations on these boxes...when they work).  Sigh.
> I guess I do need to document that this recent "fix" via freebsd boot
> survives powerdown.  I'll try the following:
> 
>   a) Power down, etc. and reboot as described.  My prediction is that it
> will now work fine, because pre10 worked fine on systems that had never
> been powered up before or that had been unplugged and cleared -- as
> long as they were not already displaying this "hung" behavior. 
> 
> I really do think that something weird is going on because I see
> differences in the boot-time behavior of nominally "identical" machines
> -- something that shakes my belief in electronic determinism (a thing
> that is none too strong anyway;-).  Allowing that any notion of a
> WinDell "conspiracy" is nonsense (it was intended as a tongue-in-cheek
> joke in the first place, and now of course Dell is working actively with
> the linux community) there still appears to be solid evidence that there
> is a non-volatile location in the 7890 subsystem on these systems that
> survives total powerdown, the placement of the NVRAM-clearing jumper, an
> adaptec-bios reset (in the card bios itself) and the POST/initialization
> process, whatever it might be.

Those aren't the only possibilities.  It's entirely possible that the bug
now lies outside the aic7xxx driver, in some of the more generic Linux
kernel code that touches or affects this chipset.  Candidates include the
generic PCI initialization code, the chipset setup code, etc.

>  Here I'm at a disadvantage -- lacking
> device specs I cannot speculate where such a location might be or how it
> gets corrupted, but it does appear that it was corrupted in the Dells on
> delivery and gets reset by WinNT and now freebsd on boot, but not by
> pre10.

I've got the docs and I can't find any location that would cause this :)
Of course, one thing I don't think you've tried is forcing the pci parity
checking off using the pci_parity boot option.  That could possibly make a
difference.

>   b) So, I'll also try to power down, etc. and reboot an earlier image,
> maybe pre3 or the like, that installed but then messed up.  By looking
> at what a revision writes that "causes" the problem and what a revision
> writes that leaves the problem alone, it may be possible to find a
> location that was written to -- wrong -- that is now not written to at
> all and that needs to be written to right.

I have a better idea.

> It shall be carefully preserved in its dysfunctional state, except that
> I will let Dell replace the bad RAM.  Hopefully it won't just suddenly
> start to work when they do...

I've made a few changes in preparation for pre11 (which isn't quite ready
yet), and some of them affect the aic7xxx=dump_card option.  I'll send you
a copy of my current aic7xxx.c file off the list.  Use it to boot a
dysfunctional machine and write down the output from the dump_card
option.  Then boot the FreeBSD floppy, reboot into the Linux code, and run
dump_card again.  If the machine magically starts working but the
dump_card output doesn't show a change, then the problem would *have* to
be in more generic code outside the aic7xxx driver.

One caveat: dump_card used to cause the driver to hang.  I think I have
that fixed, but I haven't tested it yet.  So on the first pass you might
have to boot with dump_card, then boot again without it to make sure the
machine really fails on its own.  Likewise, after the FreeBSD boot disk
you would have to boot the Linux disk twice, once with and once without
dump_card.  (If the dump_card boot goes ahead and comes up the second time
around, then we know I fixed that problem as well.)
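
If the two dump_card outputs get typed into text files, the before/after
comparison can be automated instead of eyeballed.  What follows is only a
rough sketch in Python: the function name changed_lines and the idea of
one register per line are assumptions for illustration, since the actual
dump_card output format isn't shown in this message.  It uses the
standard difflib module and nothing specific to the driver.

```python
import difflib


def changed_lines(before: str, after: str) -> list[str]:
    """Return only the lines that differ between two transcribed dumps.

    `before` and `after` are the hand-copied dump_card outputs (assumed
    here to be plain text, one item per line).  Lines present only in
    `before` come back prefixed with "- ", lines present only in `after`
    with "+ ".  An empty list means the two dumps are identical.
    """
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? ");
    # keep just the removals and additions.
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

An empty result on a machine that went from hung to working would point
at code outside the driver, per the reasoning above.  Anything it does
print is a candidate for the state the FreeBSD boot changed.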

--------------------------------------
 Doug Ledford  <dledford@dialnet.net>
  Opinions expressed are my own, but
     they should be everybody's.
--------------------------------------



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-aic7xxx" in the body of the message


