Date:      Fri, 7 Jul 2000 13:36:53 -0700 (MST)
From:      John Reynolds~ <jreynold@sedona.ch.intel.com>
To:        questions@freebsd.org
Subject:   ad4: READ command timeout -- how to debug this?
Message-ID:  <14694.16229.197155.992592@hip186.ch.intel.com>

Hello all,

I've recently built up a dual Celeron (533) machine (Abit BP6) to serve as a
new firewall/gateway/server/etc. machine for my home network. I thought the
machine was rock stable, but I've been running into periodic freezes. During
one of them I finally saw something on the console (there were never any
messages in the syslog, and no "panic" messages).

I was building world this morning (with -j8--which I'd done the day I upgraded
from 4.0-R to 4.0-STABLE) and in the middle the machine locked hard and I saw
this on the console:

 ad4: READ command timeout - resetting
 ata2: resetting devices..

and that's it--the HDD light was "on" continuously but the machine was just
toast. A hard reset and manual fsck (because things were pretty well hosed)
got it running again, but I'm at a loss for how to debug this.

dmesg shows:

atapci1: <HighPoint HPT366 ATA66 controller> port 0xe000-0xe0ff,0xdc00-0xdc03,0xd800-0xd807 irq 18 at device 19.0 on pci0
ata2: at 0xd800 on atapci1
atapci2: <HighPoint HPT366 ATA66 controller> port 0xec00-0xecff,0xe800-0xe803,0xe400-0xe407 irq 18 at device 19.1 on pci0
ad4: 9765MB <Maxtor 51024U2> [19841/16/63] at ata2-master using UDMA66
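
For anyone comparing notes, here is a minimal Python sketch that pulls the
device, channel, and transfer mode out of a disk line like the one above. The
regular expression and the names parse_disk_line and chs_mb are my own
assumptions, modeled only on this one line of output, and the capacity check
assumes 512-byte sectors; it is not a general dmesg parser.

```python
import re

# Pattern modeled on the single "ad4:" line above; other drivers or
# releases may format this line differently (assumption).
DISK_LINE = re.compile(
    r"^(?P<dev>ad\d+): (?P<mb>\d+)MB <(?P<model>[^>]+)> "
    r"\[(?P<cyl>\d+)/(?P<heads>\d+)/(?P<spt>\d+)\] "
    r"at (?P<chan>ata\d+)-(?P<role>master|slave) using (?P<mode>\S+)$"
)


def parse_disk_line(line):
    """Return a dict describing the disk, or None if the line does not match."""
    m = DISK_LINE.match(line.strip())
    if m is None:
        return None
    info = m.groupdict()
    for key in ("mb", "cyl", "heads", "spt"):
        info[key] = int(info[key])
    # Assuming 512-byte sectors, 2048 sectors make up one MB.
    info["chs_mb"] = info["cyl"] * info["heads"] * info["spt"] // 2048
    return info
```

Feeding it the ad4 line gives the channel (ata2), the role (master), and the
mode (UDMA66), which is handy when checking whether a later change to the
cabling or transfer mode actually took effect.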

Are these messages (not the dmesg output) indicative of a hardware problem?
Bad cabling? Cosmic rays?

I'm not overclocking the Celerons or anything else in the system, and I've got
the case panels off the machine, so I don't think heat is to blame here. I've
done a successful buildworld and two or three kernel builds, along with
building several massive ports like all of GNOME 1.2, emacs, docproj (TeX,
etc.), and it all went smooth as silk.

What are the things I could try to isolate the problem? Go down into
uniprocessor mode? Change the drive over to the "1st" UDMA controller (I
mistakenly connected it to IDE4, which is why this single drive is showing up
as ad4)? Change to a different disk mode (PIO)?

I think I will try a buildworld without the -jN mode just to see if it
finishes. Perhaps it was just too much "concurrent" disk activity for the
system to deal with .... (with the -j8 option to make).
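
Since a hard lock leaves nothing in the syslog, one way to run that comparison
is to record each attempt in a file that is forced to disk before the build
starts. The following is only a rough Python sketch of that idea: the helper
names log_line and run_build, the log path, and the choice of job counts are
all hypothetical, and it assumes the build is started from /usr/src with
"make buildworld".

```python
import os
import subprocess
import time


def log_line(path, text):
    """Append one line and fsync it so it has a chance of surviving a lockup."""
    with open(path, "a") as f:
        f.write(text + "\n")
        f.flush()
        os.fsync(f.fileno())


def run_build(jobs, log_path, base_cmd=("make", "buildworld"), cwd="/usr/src"):
    """Run the build with the given -j value (0 means no -j flag at all)."""
    cmd = list(base_cmd)
    if jobs > 0:
        cmd.insert(1, "-j%d" % jobs)
    # A START entry with no matching END after a reset marks the run that hung.
    log_line(log_path, "%s START %s" % (time.ctime(), " ".join(cmd)))
    rc = subprocess.call(cmd, cwd=cwd)
    log_line(log_path, "%s END rc=%d %s" % (time.ctime(), rc, " ".join(cmd)))
    return rc
```

Calling run_build(0, "/var/tmp/build-trials.log") and then
run_build(8, "/var/tmp/build-trials.log") would leave a START line without an
END line for whichever run froze the machine. Of course, if the disk itself is
what hangs, the last entry may not make it out either.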

Does anybody have any advice on troubleshooting this (besides using the ATA
disk as a boat anchor and getting a SCSI drive--which I'm considering)?

Thanks,

-Jr

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
| John Reynolds               WCCG, CCE, Higher Levels of Abstraction       |
| Intel Corporation   MS: CH6-210   Phone: 480-554-9092   pgr: 602-868-6512 |
| jreynold@sedona.ch.intel.com  http://www-aec.ch.intel.com/~jreynold/      |
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

