Date: Tue, 9 Dec 2014 11:08:57 +0100
From: Kai Gallasch <k@free.de>
To: freebsd-stable@freebsd.org
Subject: Re: 10.1 RC4 r273903 - zpool scrub on ssd mirror - ahci command timeout
Message-ID: <20141209110857.6bb9bcbd@orwell>
In-Reply-To: <20141209090157.M14881@martymac.org>
References: <20141106003240.344dedf6@orwell> <545AB64F.1060502@multiplay.co.uk> <20141106012739.509b96b5@orwell> <545ACCEF.5000300@multiplay.co.uk> <20141209093405.6dd2c268@orwell> <20141209090157.M14881@martymac.org>

On Tue, 9 Dec 2014 09:04:26 +0000 (UTC), "Ganael LAPLANCHE" <ganael.laplanche@martymac.org> wrote:

> On Tue, 9 Dec 2014 09:34:05 +0100, Kai Gallasch wrote
>
> Hi Kai,
>
> > Any ideas (left) ?
>
> There is a PR for AHCI timeouts :
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195349
>
> I don't know if it is related to your problem but maybe you can try
> the suggested workaround ?

Thank you for this information. But no, my problem seems to be unrelated.

K.

echo 'hint.ahci.0.msi="0"' >> /boot/loader.conf

After reboot:

# zpool scrub ssdpool
# zpool status ssdpool
  pool: ssdpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub in progress since Tue Dec 9 10:36:24 2014
        5.36G scanned out of 115G at 166M/s, 0h11m to go
        24.5K repaired, 4.65% done
config:

        NAME              STATE     READ WRITE CKSUM
        ssdpool           ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            gpt/ssdpool0  ONLINE       0     0    13  (repairing)
            gpt/ssdpool1  ONLINE       0     0     0

errors: No known data errors

After the zpool scrub finished:

# zpool status ssdpool
  pool: ssdpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 38.5K in 0h9m with 0 errors on Tue Dec 9 10:45:58 2014
config:

        NAME              STATE     READ WRITE CKSUM
        ssdpool           ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            gpt/ssdpool0  ONLINE       0     0    15
            gpt/ssdpool1  ONLINE       0     0     4

# zpool clear ssdpool
# zpool scrub ssdpool

During this "zpool scrub" run, one SSD drive was lost :-/
A "camcontrol rescan all" does not bring the missing SSD drive back.

# zpool status ssdpool
  pool: ssdpool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub canceled on Tue Dec 9 10:58:42 2014
config:

        NAME                     STATE     READ WRITE CKSUM
        ssdpool                  DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            gpt/ssdpool0         ONLINE       0     0     0
            2481016284460057031  UNAVAIL    297   215    47  was /dev/gpt/ssdpool1

ahcich3: Timeout on slot 24 port 0
ahcich3: is 00000000 cs fc00001f ss ff00001f rs ff00001f tfd 40 serr 00000000 cmd 0024d917
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 24 1b 4a c6 40 1e 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Command timeout
(ada3:ahcich3:0:0:0): Retrying command
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Retrying command
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retry was blocked
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <Samsung SSD 850 PRO 512GB EXM01B6Q> s/n S1SXNSAFA06835A detached
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Retrying command
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Poll timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(ada3:ahcich3:0:0:0): SETFEATURES ENABLE RCACHE. ACB: ef aa 00 00 00 40 00 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Poll timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 0001fff0 ss 0001fff0 rs 0001fff0 tfd 80 serr 00000000 cmd 0024c417
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 24 1b 4a c6 40 1e 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Command timeout
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 15 3f 4a c6 40 1e 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 ed c4 21 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 71 19 d3 40 04 00 00 01 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 bb 43 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 45 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 46 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 47 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 48 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 49 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 4a 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 4b 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 4c 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): Periph destroyed
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Poll timeout on slot 16 port 0
ahcich3: is 00000000 cs 00010000 ss 00000000 rs 00010000 tfd 80 serr 00000000 cmd 0024d017
(aprobe0:ahcich3:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted

--
PGP-KeyID = 0xE401B671927D4A5C
I am a robot.
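
As a reading aid for the "ahcich3: is ... cs ... ss ... rs ... tfd ..." lines in the kernel log, here is a minimal Python sketch that decodes the hex fields. It rests on assumptions that the post itself does not confirm: that cs and ss are 32-bit bitmasks with one bit per command slot, and that the low byte of tfd is the ATA status register (0x80 = BSY, 0x40 = DRDY). The function names are hypothetical, not part of any FreeBSD tool.

```python
import re

# ATA status register bits (assumed to be the low byte of the "tfd" field).
ATA_STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}

# Matches the register dump printed after an ahcich timeout message.
# "intr" is used as the group name for the "is" field to avoid a keyword.
REG_LINE = re.compile(
    r"is (?P<intr>[0-9a-f]{8}) cs (?P<cs>[0-9a-f]{8}) ss (?P<ss>[0-9a-f]{8}) "
    r"rs (?P<rs>[0-9a-f]{8}) tfd (?P<tfd>[0-9a-f]+) serr (?P<serr>[0-9a-f]{8}) "
    r"cmd (?P<cmd>[0-9a-f]{8})"
)


def slots_in_mask(mask):
    """Return the slot numbers (0..31) whose bit is set in a 32-bit mask."""
    return [bit for bit in range(32) if mask & (1 << bit)]


def decode_timeout(line):
    """Decode one register dump line; return None if the line does not match."""
    match = REG_LINE.search(line)
    if match is None:
        return None
    regs = {name: int(value, 16) for name, value in match.groupdict().items()}
    return {
        "cs_slots": slots_in_mask(regs["cs"]),
        "ss_slots": slots_in_mask(regs["ss"]),
        "status_flags": [name for bit, name in ATA_STATUS_BITS.items()
                         if regs["tfd"] & bit],
        "serr": regs["serr"],
    }


if __name__ == "__main__":
    sample = ("ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 "
              "tfd 80 serr 00000000 cmd 0024c417")
    print(decode_timeout(sample))
```

Under these assumptions, "cs 00000010" has only bit 4 set, which agrees with the accompanying "Timeout on slot 4" message, and "tfd 80" reads as the drive reporting busy, consistent with "device not ready" after each reset.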