Date:      Sat, 19 Mar 2005 05:22:31 -0600
From:      David Kelly <dkelly@hiwaay.net>
To:        freebsd-questions@freebsd.org
Subject:   Re: Serious issue with SATA disks again
Message-ID:  <20050319112231.GA35477@Grumpy.DynDNS.org>
In-Reply-To: <583197724.20050319103813@wanadoo.fr>
References:  <583197724.20050319103813@wanadoo.fr>

On Sat, Mar 19, 2005 at 10:38:13AM +0100, Anthony Atkielski wrote:
> 
> I need to know what is causing these problems.  They have been reported
> for a year by various people on various configurations (different
> motherboards and chipsets).  I've seen lots of complaints and reports,
> but no solutions.  It's not hardware, so don't bother suggesting that
> unless you can _prove_ that the OS is eliminated from consideration.

It's impossible to _prove_ the software is _not_ at fault just as it's
impossible to prove the hardware is not at fault. When software works
for others but not on your hardware then one can only conclude there is
_something_ about your hardware.

With seemingly random timeouts such as you are seeing I would suspect
the SATA cable. SATA runs gigabits/sec and could be very sensitive. Try
a different cable from another source.

Also run the HD manufacturer's test utility. This week a bad block
appeared on one of my SATA drives after 4,000 hours of runtime.
Downloaded a bootable ISO from Hitachi Global Storage which booted into
DOS, found and remapped my bad block.

"smartctl" from ports was also quite useful at reading the error log
maintained by the HD firmware. Interesting reading, such as my drive
temperature was 35, lifetime min/max was 19/45 (Celsius).
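
For anyone who wants to script the smartctl check, here is a minimal
sketch in Python. It assumes smartmontools is installed and that the
drive shows up as /dev/ad4 (a made-up device name, substitute your own).
The helper names are hypothetical; "-l error" and "-A" are the smartctl
options for the firmware error log and the vendor attribute table.

```python
import subprocess

def smartctl_command(device, report="error"):
    """Build a smartctl command line; nothing is executed here."""
    if report == "error":
        # Error log maintained by the drive firmware.
        return ["smartctl", "-l", "error", device]
    if report == "attributes":
        # Vendor attribute table, which is where temperature is reported.
        return ["smartctl", "-A", device]
    raise ValueError("unknown report: %r" % report)

def run_smartctl(device, report="error"):
    """Run smartctl (usually needs root) and return its text output."""
    result = subprocess.run(
        smartctl_command(device, report),
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling run_smartctl("/dev/ad4", "attributes") as root should return the
attribute table as text; what it contains depends on the drive.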

> Doesn't anyone actually know how FreeBSD works?  Someone wrote the code
> that prints the above cryptic messages.  What do they mean, _exactly_?

It means the driver asked the HD to fill a buffer, but it didn't
complete the task within the allotted time. Either the drive didn't begin, or
data was lost and fell short.
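
The idea can be illustrated with a small sketch in Python. This is not
the FreeBSD driver code (a real driver waits on interrupts and a timer
rather than polling); it only models the two failure cases described
above. The function name and the "poll" callable are hypothetical.

```python
import time

def wait_for_buffer(poll, requested, timeout_s, interval_s=0.01):
    """Wait until `requested` bytes have arrived or the deadline passes.

    `poll` is a hypothetical callable returning the bytes received so far.
    """
    deadline = time.monotonic() + timeout_s
    received = poll()
    while received < requested and time.monotonic() < deadline:
        time.sleep(interval_s)
        received = poll()
    if received >= requested:
        return "ok"
    if received == 0:
        # The drive never started sending data.
        return "timeout: transfer never began"
    # Some data arrived, but not the full buffer.
    return "timeout: transfer fell short"
```

From the driver's side both failure cases look the same: the deadline
passed before the buffer was full. That is why the message alone cannot
say whether the drive, the cable, or the controller is to blame.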

> I'm beginning to get the impression that support for disks is rather
> weak in FreeBSD 5.x.  I have mysterious SCSI errors on one machine that
> nobody seems to have any clue about, and mysterious SATA errors on
> another machine that nobody seems to have any clue about.  I can't
> really brag about the reliability or uptime of the OS if it crashes once
> a week due to unresolved bugs in disk-handling code.

A few years ago one of my then-new machines could not write a floppy in
FreeBSD but could in NT4. Tried lots of things, also got the attention
of the floppy driver maintainer. A few weeks later got the idea to
"Reset to Defaults" in the BIOS. Then reset the few specific things I
needed back the way they were. Magic. There was something undocumented
being set by BIOS at boot that didn't bother NT.

One of my BIOS settings above was to hold PCI back to version 2.0 or 2.1
vs 2.1 or 2.2. Learned one of my PCI cards didn't like something about
the new PCI spec and that the system was not smart enough to know.

More recently, in 5.2.1, I had no problems with a parallel ATA drive
with Hyperthreading enabled on my P4. No problems running sysinstall to
prep the new SATA drives. But the SATA drives locked the kernel solid
moments after first use. Disabled HT and all was fine. Something about
HT and the new GEOM framework used for SATA (but not for PATA, at least
then) didn't work. Until a block went bad on one drive there hadn't been
a drive problem in 4,000+ hours. I only reboot for power failures and
updates.

-- 
David Kelly N4HHE, dkelly@HiWAAY.net
========================================================================
Whom computers would destroy, they must first drive mad.


