From owner-freebsd-current@FreeBSD.ORG  Mon Nov 10 18:58:48 2003
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id E46F916A4CE; Mon, 10 Nov 2003 18:58:48 -0800 (PST)
Received: from obsecurity.dyndns.org
	(adsl-63-207-60-234.dsl.lsan03.pacbell.net [63.207.60.234])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id B54AB43F3F; Mon, 10 Nov 2003 18:58:45 -0800 (PST)
	(envelope-from kris@obsecurity.org)
Received: by obsecurity.dyndns.org (Postfix, from userid 1000)
	id 0A87066B28; Mon, 10 Nov 2003 18:58:44 -0800 (PST)
Date: Mon, 10 Nov 2003 18:58:44 -0800
From: Kris Kennaway <kris@obsecurity.org>
To: "Andrew P. Lentvorski, Jr." <bsder@allcaps.org>
Message-ID: <20031111025844.GA17546@xor.obsecurity.org>
References: <XFMail.20031107140654.jhb@FreeBSD.org>
	<20031107202526.S532@mail.allcaps.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="Qxx1br4bt0+wmkIi"
Content-Disposition: inline
In-Reply-To: <20031107202526.S532@mail.allcaps.org>
User-Agent: Mutt/1.4.1i
cc: Kris Kennaway <kris@obsecurity.org>
cc: re@FreeBSD.org
cc: current@FreeBSD.org
cc: John Baldwin <jhb@FreeBSD.org>
cc: sos@FreeBSD.org
Subject: Re: Too many uncorrectable read errors with atang
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 11 Nov 2003 02:58:49 -0000


--Qxx1br4bt0+wmkIi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Nov 07, 2003 at 08:36:28PM -0800, Andrew P. Lentvorski, Jr. wrote:
> On Fri, 7 Nov 2003, John Baldwin wrote:
>
> > On 07-Nov-2003 Kris Kennaway wrote:
> > > So far this has happened (well, the panic above was new) on 5 separate
> > > machines that were all working on older -current.  Now, these are all
> > > IBM DeathStar drives, but previously I was only experiencing ata
> > > errors every month or two, and they were correctable for another month
> > > or two by /dev/zero'ing the drive.
>
> IBM Deathstars have this annoying tendency to perform thermal
> recalibration cycles that cause them to delay returning data for somewhere
> between 30 and 90 seconds until the calibration finishes.  Unfortunately,
> these seem to show up as uncorrectable errors.  It's a true pain with RAID
> cards as the RAID array will take the drive offline when it could retry
> the data.
>
> If you can, try to reduce the temperature of the drives.  This generally
> helped my Deathstars before I got rid of them all.
>
> Also, given the touchiness of PRML detectors, it is entirely possible that
> the drive is reading increased errors due to the solar flares as a need to
> thermally recalibrate more often.
>
> Other than tossing the drives, ATAng, like Windows, would have to be more
> aggressive about retrying even uncorrectable errors for up to a minute or
> so before giving up.

It looks like my drives are indeed dying: reverting to 5.1-RELEASE
still gives lots of errors on 2 of the machines.  I guess ATAng is
more sensitive to errors on the others.

Kris

--Qxx1br4bt0+wmkIi
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQE/sFBkWry0BWjoQKURAtrhAJ9uoEYlreAYD5bDxLsZWJHe+SM3cQCfT0zs
KfksxbwZSj0tU4QPlsFNzks=
=Z8BD
-----END PGP SIGNATURE-----

--Qxx1br4bt0+wmkIi--