Date:      Sat, 18 Sep 2004 15:50:16 -0400
From:      Louis LeBlanc <FreeBSD@keyslapper.org>
To:        freebsd-hardware@freebsd.org
Cc:        FreeBSD@keyslapper.org
Subject:   Intel ICH5 SATA 150 disk controller support?
Message-ID:  <20040918195016.GB99010@keyslapper.org>

More Dimension 8300 woes.  This is long, and I'm sorry, but I wanted
to include as much relevant info as possible.

My shiny new Dell Dimension 8300 continues to have problems writing to
the hard drive.  Over the last 3 months, it has regularly had problems
with DMA timeouts, and only the following sort of messages show up in
/var/log/messages:

Sep 17 19:41:33 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=57935679
Sep 17 19:41:39 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=57852767
Sep 17 19:43:06 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=69973311
Sep 17 19:43:12 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=58356223
Sep 17 19:43:18 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=69206207

This is often followed by a complete lockup.  Hard reboot required.  I
have posted queries on freebsd-questions, where it was suggested that
I might have gotten an off-the-shelf dud for a hard drive.  I didn't
think this was likely (new drive, tested at the factory and all), so I
asked for the configuration approach to debugging the problem.  It
was suggested, among other things, that I put the drive in PIO mode,
which caused the machine to immediately lock up without further ado.
None of the other suggestions had much effect either way.
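
For anyone wanting to dig through a longer /var/log/messages, here is a
rough sketch of how the timeout lines could be tallied per device and
their LBAs collected.  The regular expression is inferred only from the
five sample lines above, and the names TIMEOUT_RE, summarize_timeouts and
SAMPLE are made up for illustration.  Whether the LBAs cluster or spread
out is at best a loose hint about media versus controller trouble, not a
diagnosis.

```python
import re
from collections import Counter

# Field layout inferred from the five sample lines quoted above (assumption:
# other timeout messages follow the same "retrying (N retries left)" shape).
TIMEOUT_RE = re.compile(
    r"kernel: (?P<dev>ad\d+): TIMEOUT - (?P<op>\w+) retrying "
    r"\((?P<left>\d+) retries left\) LBA=(?P<lba>\d+)"
)


def summarize_timeouts(lines):
    """Return (Counter keyed by (device, operation), sorted list of LBAs)."""
    counts = Counter()
    lbas = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m is None:
            continue  # not a timeout line; skip it
        counts[(m.group("dev"), m.group("op"))] += 1
        lbas.append(int(m.group("lba")))
    return counts, sorted(lbas)


# The five lines from the log excerpt, copied verbatim.
SAMPLE = [
    "Sep 17 19:41:33 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=57935679",
    "Sep 17 19:41:39 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=57852767",
    "Sep 17 19:43:06 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=69973311",
    "Sep 17 19:43:12 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=58356223",
    "Sep 17 19:43:18 key2 kernel: ad4: TIMEOUT - WRITE_DMA retrying (2 retries left) LBA=69206207",
]

counts, lbas = summarize_timeouts(SAMPLE)
for (dev, op), n in counts.items():
    print(f"{dev} {op}: {n} timeouts, LBA range {lbas[0]}..{lbas[-1]}")
```

To run it against the real log, the SAMPLE list could be replaced with
open("/var/log/messages").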

While investigating, I collected the following info on the drive and
controller:

<root># atacontrol cap 2 0
ATA channel 2, Master, device ad4:

ATA/ATAPI revision    6
device model          WDC WD1600JD-75HBB0
serial number         WD-WMAL91191824
firmware revision     08.02D08
cylinders             16383
heads                 16
sectors/track         63
lba supported         268435455 sectors
lba48 supported         312500000 sectors
dma supported
overlap not supported

Feature                      Support  Enable    Value   Vendor
write cache                    yes      yes
read ahead                     yes      yes
dma queued                     no       no      0/0x00
SMART                          yes      yes
microcode download             yes      yes
security                       no       no
power management               yes      yes
advanced power management      no       no      0/0x00
automatic acoustic management  yes      yes     128/0x80 128/0x80

As mentioned in previous messages to questions (search for the string
"TIMEOUT - WRITE_DMA" or "AARRRGGHHH!" to see the thread), the disk
controller is an Intel ICH5 SATA 150.  I was pretty sure this one was
supported, so I went to the FreeBSD site to verify it in the 5.2.1
hardware list, and it wasn't there.  I was positive I had seen it, so
I double checked the 4.10 controller list, and the ICH5 *is* listed.
I've found nothing in the hardware list archives relating to the write
dma timeout or the Intel ICH5 controller.

Here's the relevant info from /var/run/dmesg.boot:
atapci1: <Intel ICH5 SATA150 controller> port 0xfea0-0xfeaf,0xfe30-0xfe33,0xfe20-0xfe27,0xfe10-0xfe13,0xfe00-0xfe07 irq 18 at device 31.2 on pci0
atapci1: [MPSAFE]
ata2: at 0xfe00 on atapci1
ata2: [MPSAFE]
. . .
ad4: 152587MB <WDC WD1600JD-75HBB0> [310019/16/63] at ata2-master UDMA100
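
As a sanity check that the kernel sees the whole drive, the numbers in
the dmesg line and the atacontrol output can be compared with a little
arithmetic.  This sketch assumes 512-byte sectors and that the "MB" in
the dmesg line means 2^20 bytes truncated to an integer; neither is
stated in the output itself.

```python
SECTOR_BYTES = 512      # assumption: standard 512-byte sectors
MIB = 1024 * 1024       # assumption: dmesg "MB" is 2**20 bytes

# Geometry from the dmesg line: [310019/16/63]
cylinders, heads, sectors_per_track = 310019, 16, 63
chs_sectors = cylinders * heads * sectors_per_track
chs_mb = chs_sectors * SECTOR_BYTES // MIB

# Sector counts reported by "atacontrol cap 2 0"
lba28_sectors = 268435455
lba48_sectors = 312500000
lba48_bytes = lba48_sectors * SECTOR_BYTES

print("CHS sectors:", chs_sectors)
print("CHS size in MB:", chs_mb)
print("LBA48 size in bytes:", lba48_bytes)
print("LBA28 count is the 28-bit limit:", lba28_sectors == 2**28 - 1)
print("Sectors beyond the CHS geometry:", lba48_sectors - chs_sectors)
```

Under those assumptions the geometry works out to the 152587MB that
dmesg reports, and the LBA48 count corresponds to a 160 GB drive.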

So, last night it locked up again, and as usual, the disk was all
fsck'd up when I hard cycled it.  Data was lost in /var, /usr, and
/home.  Yes, I have the most critical data backed up.

This time, I decided to get Dell involved, and spent a while on the
phone with a very helpful hardware tech who was not at all concerned
with the fact that I removed WinXP without ever booting it.  He
introduced me to a nifty little BIOS-level diagnostic tool, accessed
by hitting CTRL-ALT-D when the Dell logo flashes up at startup.  The
drive passed just fine, so the chances my problem is the disk is
pretty close to nil.

I wonder what other tools are available at boot up . . .

This system has encountered problems serious enough to require a full
install from scratch 3 times including last night (though I haven't
reinstalled it this time yet).  Several of my ports seem to have
stopped working this time, and even though I reinstalled them from
scratch, the browsers won't work.

I imagine it will be of interest, so I'll just say that I have turned
off softupdates for /, /usr, and /var, and have tried the BIOS-level
DMA setting both on and off, and none of it seems to matter.

So, my first question: is there any reason that ICH5 support would be
dropped from 5.2.1?  It doesn't really make sense to me for a new
controller type to be dropped unless there were licensing issues,
which doesn't seem likely.

Second of all, am I now pigeonholed into FreeBSD 4.10 (assuming I
refuse to run anything but *BSD on the machine)?  Is there any
expectation that 5.3 will once again include the ICH5 controller?

And finally, though I probably know the answer, is there any way to
get this support in the 5.x branch before 5.3 is released?  Any chance
this is already present in CURRENT?  I don't mind going that route
until 5.3 is released, but only if it will get me the device support I
need.  I can just wait until then to put the system in place as my
main server system (it won't be long, right?).

***\  I'm usually subscribed to freebsd-questions, but I'm not
*** > subscribed here, so I would appreciate being copied directly
***/  in your responses.

Thanks in advance.
Lou
-- 
Louis LeBlanc               FreeBSD@keyslapper.org
Fully Funded Hobbyist, KeySlapper Extrordinaire :)
http://www.keyslapper.org                     ԿԬ

Hacker's Law:
  The belief that enhanced understanding will necessarily stir
  a nation to action is one of mankind's oldest illusions.

