Date:      Thu, 21 Jun 2001 13:14:59 -0400
From:      Bill Moran <wmoran@iowna.com>
To:        freebsd-hackers@freebsd.org
Subject:   PANIC - 4.3-STABLE, suspecting ata controller
Message-ID:  <3B322B93.20994E19@iowna.com>

Hello all,

I've been trying to resolve this for a few weeks now. Previous posts to
-questions have produced no useful info, and I haven't found anything in
the list archives that has helped.

This machine is running 4.3-STABLE cvsupped from May 21. I have
experienced the exact same problem with various versions of 4.2-RELEASE
and 4.3-RELEASE/STABLE.

The machine is a backup server. Nightly at 3:00AM it does either an
rsync or (on Sundays only) a rm && cp to keep the data synchronized, and
the machine can then be used for manually run multi-tape backups during
the day. The data involved is approx 30G.
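
As a rough sketch of the nightly job described above, the choice between
the two modes could look like the following. The paths, the function
names and the exact rsync/cp flags are assumptions made for illustration;
none of them are taken from the actual machine.

```python
import datetime
import subprocess
from typing import List

# Hypothetical paths: the real NFS mount point and mirror directory
# are not given in this message.
NFS_SOURCE = "/mnt/remote/data"
LOCAL_MIRROR = "/backup/mirror"


def nightly_commands(day: datetime.date) -> List[List[str]]:
    """Return the commands for the nightly run on the given date."""
    if day.weekday() == 6:  # date.weekday(): Monday is 0, Sunday is 6
        # Sundays only: throw the mirror away and copy everything again.
        return [
            ["rm", "-rf", LOCAL_MIRROR],
            ["cp", "-Rp", NFS_SOURCE, LOCAL_MIRROR],
        ]
    # Every other night: incremental rsync (flags are an assumption).
    return [["rsync", "-a", "--delete", NFS_SOURCE + "/", LOCAL_MIRROR + "/"]]


def run_nightly(day: datetime.date) -> None:
    """Run the commands in order, stopping at the first failure (like &&)."""
    for command in nightly_commands(day):
        subprocess.run(command, check=True)
```

Calling run_nightly(datetime.date.today()) from a 3:00AM cron entry would
reproduce the schedule; the panic described below occurs while the cp or
rsync step is running.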

The panic ALWAYS occurs during the transfer (via NFS). The remote
machine is NFS-mounted on this machine and the data is simply copied (or
rsynced). The nature of the panic is rather strange: the machine will
run problem-free for a few days (as long as a week) and then experience
a panic. After this initial panic, it becomes completely unable to
complete a transfer operation (cp or rsync) without another panic. The
only way I've found to restore it to proper function is to newfs the
local filesystem that stores the data mirror. Cursory access to the
(apparently corrupt) filesystem does not cause a panic. After the newfs,
the system will run reliably for a few days, maybe a week, and then
another panic occurs and the cycle restarts (the machine is not reliable
again until after a newfs). I've tried both softupdates and straight
sync on the filesystem with the same results.

I've set the system up to do crash dumps, but I'm not a kernel hacker
and the information gdb gives me is beyond me. Here's what I see:

dmesg: _kvm_vatop: read: undefined error: 0
dmesg: kvm_read: invalid address (c0280074)

I checked the cvslog and there don't seem to be any changes in the kvm
files from May 21 till now, so I've ruled out trying _another_ cvsup for
the time being.
Following the kernel debugging instructions in the handbook hasn't been
very helpful: a "where" command gives me only a hex address, which I
don't know what to do with.

This is running on an Asus A7V133 mobo. Hopefully the rest of the
details will be obvious from the dmesg output.

Since this email is already too big, I've made several files available
via http (on the client's web site - it's only a 512K link, so be gentle
;)

http://www.prioritydesigns.com/crashdata/ contains the files. I've put a
dmesg.out (obvious) as well as the kernel.0, vmcore.0 and kernel.debug
there, if anyone would like to run them through gdb themselves
(***WARNING*** vmcore.0 is 128M). (NOTE: the web server does not allow
directory listings, so you'll have to access the files directly.)

So ... I'm looking for any possible help in straightening this out.
Advice on how to run gdb more effectively is welcome, as is anyone who
wants to run it themselves and suggest changes as a result. If you
need any more files/information, please let me know.
My primary goal is to get this machine operating reliably. My secondary
goal is to help identify and correct any problems in the FreeBSD code
(should that be the cause).

TIA,
Bill

-- 
If a bird in the hand is worth two in the bush,
then what can I get for two hands in the bush?

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message



