Date:      Tue, 15 Sep 2009 17:54:36 +0200
From:      Pawel Jakub Dawidek <pjd@FreeBSD.org>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        Kris Kennaway <kris@FreeBSD.org>, FreeBSD Current <current@freebsd.org>
Subject:   Re: ata timeouts under load
Message-ID:  <20090915155436.GB2199@garage.freebsd.pl>
In-Reply-To: <4AAD5365.5000902@FreeBSD.org>
References:  <4AAD4E51.5060908@FreeBSD.org> <4AAD5365.5000902@FreeBSD.org>

On Sun, Sep 13, 2009 at 11:17:41PM +0300, Alexander Motin wrote:
> Kris Kennaway wrote:
> > I am getting timeouts on 8.0b4/HEAD when I do a lot of ZFS I/O to a pool
> > on ad4:
> >
> > atapci0: <VIA 6420 SATA150 controller> port
> > 0xc800-0xc807,0xc400-0xc403,0xc000-0xc007,0xb800-0xb803,0xb400-0xb40f,0xb000-0xb0ff
> > irq 20 at device 15.0 on pci0
> > ata2: <ATA channel 0> on atapci0
> > ata3: <ATA channel 1> on atapci0
> > ata0: <ATA channel 0> on atapci1
> > ata1: <ATA channel 1> on atapci1
> >
> > ad4: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata2-master SATA150
> > ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
> > completing request directly
> > ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
> > completing request directly
> > ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing
> > request directly
> > ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing
> > request directly
> > ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> > ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=344052040
> > ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
> > completing request directly
> > ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
> > completing request directly
> >
> > It becomes stuck in a loop displaying the above and is unable to
> > complete further I/O operations.  I wonder if it is just batching up a
> > lot of I/O and then timing out because it is busy, and then not
> > recovering from this state?
> >
> > Any ideas what could be wrong?
>
> There are two different kinds of timeouts we can see:
>  - the first one, "ad4: WARNING - ...", is just a queue waiting timeout.
> It is not the cause but a consequence of the problem, and I doubt that
> it is reasonable to do it at all.
>  - the second one, "TIMEOUT - WRITE_DMA48 ...", is a real command
> execution timeout. I don't know whether this is the result of some
> improper error recovery, or whether your drive has indeed lost the
> required servo information near LBA=344052040 and is taking too long
> trying to find it. You can try to read that sector and nearby ones
> with dd.
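
The read test suggested above can be sketched as follows. This is only an
illustration, not something from the thread: the helper name scan_sectors,
the 512-byte sector size and the default window of 64 sectors on each side
are assumptions.

```python
import os

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def scan_sectors(path, lba, span=64, sector_size=SECTOR_SIZE):
    """Read every sector in [lba - span, lba + span]; return the LBAs that failed."""
    first = max(0, lba - span)
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for sector in range(first, lba + span + 1):
            try:
                # Sector-aligned read at an absolute offset, one sector at a time.
                data = os.pread(fd, sector_size, sector * sector_size)
            except OSError:
                bad.append(sector)  # the read itself returned an I/O error
                continue
            if len(data) < sector_size:
                # Short read, treated as unreadable (e.g. past the end of the device).
                bad.append(sector)
    finally:
        os.close(fd)
    return bad
```

Run as root, scan_sectors("/dev/ad4", 344052040) would cover the same range
as "dd if=/dev/ad4 of=/dev/null bs=512 skip=344051976 count=129". Reading
one sector at a time is slow, but it shows exactly which LBAs fail instead
of stopping at the first error.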

Could this be related to BIO_FLUSH requests?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!



