FreeBSD Mail Archives

Date:      Sat, 27 Jun 1998 11:54:50 +0930
From:      Greg Lehey <grog@lemis.com>
To:        Donald Burr <dburr@POBoxes.com>, FreeBSD Hackers <hackers@FreeBSD.ORG>
Subject:   Re: odd problems with AMD K6
Message-ID:  <19980627115450.O16259@freebie.lemis.com>
In-Reply-To: <XFMail.980626105721.dburr@POBoxes.com>; from Donald Burr on Fri, Jun 26, 1998 at 10:57:21AM -0700
References:  <XFMail.980626105721.dburr@POBoxes.com>


On Friday, 26 June 1998 at 10:57:21 -0700, Donald Burr wrote
(to -hardware):

After reading this message, I don't think any of these are hardware
problems, so I'm following up to -hackers.

> I have just updated my system to an AMD K6/233.  Chip serial number is
> "C 9818 FPJW".  And, of course, my OS is FreeBSD 2.2.6-RELEASE.
>
> Some other details on my setup:
> Mobo: EFA E5TX-AT-5 "Pegasus I"
> Chipset: Intel 430TX (Triton); MTXC (82439TX)*1, PIIX4 (82371AB)* 1; I/O
> chipset is ALI M5135 (Yes, I am using the IWill sio patches...)
> OS: FreeBSD 2.2.6-RELEASE, with the IWill sio patches.
>
> I heard a while back about serious problems with the K6, so I searched the
> mailing list archives.  There were indeed problems, but AMD reportedly
> fixed them as of chip revision "B 9729 xxxx".  Since I have heard no new
> problem reports since then, I am assuming my chip is one of the "good"
> revisions.

I think this is a safe assumption.  I haven't heard of any problems
since the VM problems with the early versions, and my own K6/233 has
been running fine ever since I found an adequate cooling fan.  Your
dmesg output indicates that you have the same stepping as my chip.

> Anyway, the system appears to be working fine and all, that is, for normal
> usage.  However, the other day I tried a "make world" on the 2.2.6-RELEASE
> sources, and got an error.  Which leads into...
>
> 1.  Make world fails at *exactly* the same file, with *exactly* the same
>     error.
>
> I've run three consecutive 'make world's, and all of them fail, but they
> fail at *exactly* the same file, with *exactly* the same error.  Here is
> the log from one such session:
>
> ===> share/doc/papers/memfs
> indxbib -c
> /usr/src/share/doc/papers/memfs/../../../../contrib/groff/indxbib/eign -o
> ref.bib /usr/src/share/doc/papers/memfs/ref.bib
> indxbib in free(): warning: page is already free.
> indxbib in free(): warning: page is already free.
> vgrind -f < /usr/src/share/doc/papers/memfs/A.t > A.gt
> refer -n -e -l -s -p /usr/src/share/doc/papers/memfs/ref.bib
> /usr/src/share/doc/papers/memfs/0.t /usr/src/share/doc/papers/memfs/1.t
> A.gt > paper.t
> Failed assertion at line 161, file
> `/usr/src/gnu/usr.bin/groff/refer/../../../../contrib/groff/refer/token.cc'
> .
> Abort trap - core dumped
> *** Error code 134
>
> In all cases, the "refer" process dies with signal 6, according to the
> system logs (dmesg).

More to the point, it's voluntary: refer aborts itself when the
assertion fails.
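
The "Error code 134" in the log fits this reading.  A shell reports a
process killed by a signal as 128 plus the signal number, and
134 - 128 = 6, which is SIGABRT, the signal abort() raises after a
failed assertion.  A minimal sketch of that arithmetic in Python (the
helper name decode_status is purely illustrative):

```python
import signal

def decode_status(code):
    """Split a shell-style exit status into (kind, number).

    Values above 128 conventionally mean 'killed by signal (code - 128)';
    anything else is treated as an ordinary exit code.
    """
    if code > 128:
        return ("signal", code - 128)
    return ("exit", code)

kind, number = decode_status(134)
print(kind, number, number == signal.SIGABRT)
```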

> Since the problem occurs at exactly the same spot, with exactly the same
> error, I am leaning towards suspecting a problem on my installed system,
> rather than a hardware problem (since hardware trouble generally produces
> random, unpredictable errors).

A good assumption, up to a point.  If you build an executable on
flaky hardware, and the executable is broken as a result, you will
frequently get repeatable problems like this.  In this case, it could
also mean that the input file to refer is corrupted, which is what I
think is meant by the line:

A.gt > paper.t

I'd take a look at /usr/src/share/doc/papers/memfs/A.t if I were you.
It should have a size of 5077 bytes.
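
One way to make that check, sketched in Python: compare the file size
with the 5077 bytes quoted above and count bytes that would be
surprising in a text source file.  The helper name and the rule for
what counts as "suspicious" are assumptions made for this sketch, not
part of the build.

```python
import os

EXPECTED_SIZE = 5077  # size quoted above for A.t

def check_text_file(path, expected_size=EXPECTED_SIZE):
    """Return (size_ok, suspicious).

    suspicious counts NULs and other control bytes, not counting tab,
    newline and carriage return.
    """
    with open(path, "rb") as f:
        data = f.read()
    suspicious = sum(1 for b in data
                     if b < 0x20 and b not in (0x09, 0x0A, 0x0D))
    return len(data) == expected_size, suspicious

if __name__ == "__main__":
    path = "/usr/src/share/doc/papers/memfs/A.t"
    if os.path.exists(path):
        print(check_text_file(path))
```

A wrong size or a non-zero count would point at the source tree rather
than at refer itself.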

> 2.  Odd system crash -- once.
>
> When I said before that "the system appears to be working fine and all,"
> that was sort of a lie.  The system did crash, *ONCE*.  I have *NOT* been
> able to reproduce this crash, however.
>
> What happened is this: I was in X, doing a *LOT* of things simultaneously
> (i.e. the system was heavily loaded) -- a bunch of usenet articles were
> being spooled in, I was running a make world, encoding some mp3's, the
> usual Netscape and email client, etc.  Then I started up XV to view some
> graphics that just came in.  The system froze, and rebooted.  The system
> did dump core, however, and this is the result (using kgdb):
>
> IdlePTD 279000
> current pcb at 25325c
> panic: vref used where vget required
> #0  0xf0116c5e in boot ()
> (kgdb) bt
> #0  0xf0116c5e in boot ()
> #1  0xf0116f4a in panic ()
> #2  0xf013cc07 in vref ()
> #3  0xf01043d0 in iso_iget ()
> #4  0xf010686a in cd9660_root ()
> #5  0xf013b6c0 in lookup ()
> #6  0xf013b04d in namei ()
> #7  0xf013fa04 in stat ()
> #8  0xf01f59a6 in syscall ()
> #9  0x2fc5 in ?? ()
> #10 0x107e in ?? ()
>
> Again, I have not been able to reproduce this.  I've run the system
> ragged since then, and it hasn't crashed a single time.

Doesn't look like hardware.  There have been some software problems in
this area.  Do you still have the dump?

> Now, last, but not least, I have a not-so-serious (but cosmetically
> ugly) problem:
>
> 3.  Dmesg output is slightly screwy.
>
> If I boot the GENERIC kernel, the CPU type is properly detected and prints
> out properly:
>
> CPU: AMD-K6tm w/ multimedia extensions (233.86-MHz 586-class CPU)
>   Origin = "AuthenticAMD"  Id = 0x562  Stepping=2
>   Features=0x8001bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,MMX>
>
> however, if I boot my own custom kernel, it shows something really odd.
>
> CPU: \^E (233.86-MHz 586-class CPU)
>   Origin = "AuthenticAMD"  Id = 0x562  Stepping=2
>   Features=0x8001bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,MMX>
>
> Note the odd-looking CPU name...

Strange.

> Perhaps I'm doing something slightly wrong in my config?  A copy of it is
> attached for your viewing pleasure.

I can't see how the kernel config could influence this.  I'd guess
that you have some data corruption somewhere.
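
Note that the Id and Features lines are identical in both boots, so
only the name string differs.  Both values decode consistently with
the rest of the output, as the following Python sketch shows.  It
assumes the standard x86 CPUID layout (stepping in bits 0-3, model in
bits 4-7, family in bits 8-11); the function names and the partial
feature table are illustrative only.

```python
# Subset of the standard CPUID feature bits (bit number -> name).
FEATURE_BITS = {0: "FPU", 1: "VME", 2: "DE", 3: "PSE", 4: "TSC",
                5: "MSR", 6: "PAE", 7: "MCE", 8: "CX8", 23: "MMX"}

def decode_id(cpu_id):
    """Split a CPUID signature into family, model and stepping."""
    return {"family": (cpu_id >> 8) & 0xF,
            "model": (cpu_id >> 4) & 0xF,
            "stepping": cpu_id & 0xF}

def decode_features(features):
    """List the names of the feature bits that are set, lowest bit first."""
    return [name for bit, name in sorted(FEATURE_BITS.items())
            if features & (1 << bit)]

print(decode_id(0x562))
print(decode_features(0x8001bf))
```

Family 5 matches "586-class", stepping 2 matches "Stepping=2", and the
feature list matches the names printed by the kernel.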

Greg
--
See complete headers for address and phone numbers
finger grog@lemis.com for PGP public key

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?19980627115450.O16259>
