Date: Tue, 22 Sep 1998 10:33:24 -0400 (EDT)
From: "Robert G. Brown"
To: Doug Ledford
cc: aic7xxx Mailing List
Subject: Re: One more 2300 healthy (rats?)
In-Reply-To: <360755CC.C967DECE@dialnet.net>

On Tue, 22 Sep 1998, Doug Ledford wrote:

> Robert G. Brown wrote:
> >
> > Anyway, the 9/19 cam boot.flp image booted my "sick" PowerEdge 2300,
> > found the 7890 device, and I could have proceeded to install freebsd if
> > I had any idea how to install freebsd from just one floppy.  It also
> > rewrote whatever internal register was causing the trouble, as
> > 5.1.0pre10 booted flawlessly immediately thereafter.  I have to confess
> > that I'm amazed that there is something that critical to function that
> > isn't cleared by a power cycle (including one where I jumpered NVRAM
> > clear and pulled the plug and punched the power button and...), but
> > there it is.
> >
> Now, the next real question is, if you go ahead and power that system down,
> unplug the cord, discharge the capacitors in the power supply, clear the
> NVRAM, and do everything else you can imagine to make that system go back to
> original, will the pre10 boot up without first booting the FreeBSD-CAM
> floppy?  If so, then I'm stumped and amazed.  To the best of my knowledge,
> everything the FreeBSD ahc cam driver sets and everything the linux aic7xxx
> driver sets are *all* volatile registers and locations in the sense that they
> go away with a power down.  So, if the machine is "fixed" so to speak now
> and no longer needs pre-booted after a power down to get pre10 to work, then
> something weird is going on.  Any clues what it might be in that case
> Justin?

Aww, and I already started to use the machine (yes, Virginia, I do
actually do MC simulations on these boxes...when they work).  Sigh.  I
guess I do need to document that this recent "fix" via freebsd boot
survives powerdown.  I'll try the following:

  a) Power down, etc. and reboot as described.  My prediction is that it
will now work fine, because pre10 worked fine on systems that had never
been powered up before or that had been unplugged and cleared -- as long
as they were not already displaying this "hung" behavior.  I really do
think that something weird is going on because I see differences in the
boot-time behavior of nominally "identical" machines -- something that
shakes my belief in electronic determinism (a thing that is none too
strong anyway;-).

Allowing that any notion of a WinDell "conspiracy" is nonsense (it was
intended as a tongue-in-cheek joke in the first place, and now of course
Dell is working actively with the linux community) there still appears
to be solid evidence that there is a non-volatile location in the 7890
subsystem on these systems that survives total powerdown, the placement
of the NVRAM-clearing jumper, an adaptec-bios reset (in the card bios
itself) and the POST/initialization process, whatever it might be.  Here
I'm at a disadvantage -- lacking device specs I cannot speculate where
such a location might be or how it gets corrupted, but it does appear
that it was corrupted in the Dells on delivery and gets reset by WinNT
and now freebsd on boot, but not by pre10.

  b) So, I'll also try to power down, etc. and reboot an earlier image,
maybe pre3 or the like, that installed but then messed up.  By looking
at what a revision writes that "causes" the problem and what a revision
writes that leaves the problem alone, it may be possible to find a
location that was written to -- wrong -- that is now not written to at
all and that needs to be written to right.

This is going to be tricky, as of four systems b1-b4 that I installed at
the same time and on the same day with the same image, b3 and b4 are
still running (a guy here has had them running a calculation the entire
time so I haven't been able to reinstall/reboot them).  It could be (if
he ever finishes;-) that when I power THEM down they won't come back --
that was my experience with pre3 and at least one machine with pre7.  I
haven't had any opportunity to properly verify that the corruption
problem itself is reproducible, as only yesterday did I manage to fix
one that was corrupted!

> Yeah, don't "fix" that machine just yet :)  If you can, run the test above
> first.

It shall be carefully preserved in its dysfunctional state, except that
I will let Dell replace the bad RAM.  Hopefully it won't just suddenly
start to work when they do...

   rgb

Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb@phy.duke.edu