Date: Tue, 9 Dec 2014 11:08:57 +0100
From: Kai Gallasch <k@free.de>
To: freebsd-stable@freebsd.org
Subject: Re: 10.1 RC4 r273903 - zpool scrub on ssd mirror - ahci command timeout
Message-ID: <20141209110857.6bb9bcbd@orwell>
In-Reply-To: <20141209090157.M14881@martymac.org>
References: <20141106003240.344dedf6@orwell> <545AB64F.1060502@multiplay.co.uk> <20141106012739.509b96b5@orwell> <545ACCEF.5000300@multiplay.co.uk> <20141209093405.6dd2c268@orwell> <20141209090157.M14881@martymac.org>
On Tue, 9 Dec 2014 09:04:26 +0000 (UTC), "Ganael LAPLANCHE"
<ganael.laplanche@martymac.org> wrote:

> On Tue, 9 Dec 2014 09:34:05 +0100, Kai Gallasch wrote
>
> Hi Kai,
>
> > Any ideas (left) ?
>
> There is a PR for AHCI timeouts :
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195349
>
> I don't know if it is related to your problem but maybe you can try
> the suggested workaround ?

Thank you for this information.
But no. My problem seems to be unrelated.

K.

echo 'hint.ahci.0.msi="0"' >> /boot/loader.conf

After reboot:

# zpool scrub ssdpool
# zpool status ssdpool
  pool: ssdpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub in progress since Tue Dec 9 10:36:24 2014
        5.36G scanned out of 115G at 166M/s, 0h11m to go
        24.5K repaired, 4.65% done
config:

        NAME              STATE     READ WRITE CKSUM
        ssdpool           ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            gpt/ssdpool0  ONLINE       0     0    13  (repairing)
            gpt/ssdpool1  ONLINE       0     0     0

errors: No known data errors

After the zpool scrub finished:

# zpool status ssdpool
  pool: ssdpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 38.5K in 0h9m with 0 errors on Tue Dec 9 10:45:58 2014
config:

        NAME              STATE     READ WRITE CKSUM
        ssdpool           ONLINE       0     0     0
          mirror-0        ONLINE       0     0     0
            gpt/ssdpool0  ONLINE       0     0    15
            gpt/ssdpool1  ONLINE       0     0     4

# zpool clear ssdpool
# zpool scrub ssdpool

During this "zpool scrub" run, one SSD drive was lost :-/
A "camcontrol rescan all" does not bring the missing SSD back.

# zpool status ssdpool
  pool: ssdpool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: scrub canceled on Tue Dec 9 10:58:42 2014
config:

        NAME                     STATE     READ WRITE CKSUM
        ssdpool                  DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            gpt/ssdpool0         ONLINE       0     0     0
            2481016284460057031  UNAVAIL    297   215    47  was /dev/gpt/ssdpool1

ahcich3: Timeout on slot 24 port 0
ahcich3: is 00000000 cs fc00001f ss ff00001f rs ff00001f tfd 40 serr 00000000 cmd 0024d917
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 24 1b 4a c6 40 1e 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Command timeout
(ada3:ahcich3:0:0:0): Retrying command
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Retrying command
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retry was blocked
ada3 at ahcich3 bus 0 scbus3 target 0 lun 0
ada3: <Samsung SSD 850 PRO 512GB EXM01B6Q> s/n S1SXNSAFA06835A detached
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Retrying command
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Poll timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(ada3:ahcich3:0:0:0): SETFEATURES ENABLE RCACHE. ACB: ef aa 00 00 00 40 00 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Poll timeout on slot 4 port 0
ahcich3: is 00000000 cs 00000010 ss 00000000 rs 00000010 tfd 80 serr 00000000 cmd 0024c417
(aprobe0:ahcich3:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted
ahcich3: Timeout on slot 4 port 0
ahcich3: is 00000000 cs 0001fff0 ss 0001fff0 rs 0001fff0 tfd 80 serr 00000000 cmd 0024c417
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 24 1b 4a c6 40 1e 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Command timeout
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 15 3f 4a c6 40 1e 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 ed c4 21 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 71 19 d3 40 04 00 00 01 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 f0 bb 43 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 45 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 46 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 47 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 48 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 49 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 4a 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 4b 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 e8 bb 4c 23 40 20 00 00 00 00 00
(ada3:ahcich3:0:0:0): CAM status: Unconditionally Re-queue Request
(ada3:ahcich3:0:0:0): Error 5, Periph was invalidated
(ada3:ahcich3:0:0:0): Periph destroyed
ahcich3: AHCI reset: device not ready after 31000ms (tfd = 00000080)
ahcich3: Poll timeout on slot 16 port 0
ahcich3: is 00000000 cs 00010000 ss 00000000 rs 00010000 tfd 80 serr 00000000 cmd 0024d017
(aprobe0:ahcich3:0:0:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich3:0:0:0): CAM status: Command timeout
(aprobe0:ahcich3:0:0:0): Error 5, Retries exhausted

-- 
PGP-KeyID = 0xE401B671927D4A5C

I am a robot.
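
For anyone who wants to see which sectors the timed-out commands were touching, here is a rough Python sketch that decodes the ACB bytes of the READ_FPDMA_QUEUED / WRITE_FPDMA_QUEUED lines in the log above. It assumes the 12 bytes are printed in the order command, features, lba_low, lba_mid, lba_high, device, lba_low_exp, lba_mid_exp, lba_high_exp, features_exp, sector_count, sector_count_exp, and that for these queued commands the transfer length sits in the features fields. The function name decode_fpdma_acb is made up for this example; treat the whole thing as an aid for reading the log, not as a reference decoder.

```python
def decode_fpdma_acb(acb: str) -> dict:
    """Decode the 12 hex bytes of an FPDMA QUEUED ACB line (assumed layout)."""
    b = [int(x, 16) for x in acb.split()]
    if len(b) != 12:
        raise ValueError("expected 12 ACB bytes")
    (command, features, lba_low, lba_mid, lba_high, _device,
     lba_low_exp, lba_mid_exp, lba_high_exp,
     features_exp, _sector_count, _sector_count_exp) = b
    # 0x60 = READ FPDMA QUEUED, 0x61 = WRITE FPDMA QUEUED
    ops = {0x60: "read", 0x61: "write"}
    if command not in ops:
        raise ValueError("not an FPDMA QUEUED command")
    # 48-bit LBA: the three low bytes first, then the three "exp" bytes.
    lba = (lba_low | lba_mid << 8 | lba_high << 16 |
           lba_low_exp << 24 | lba_mid_exp << 32 | lba_high_exp << 40)
    # Transfer length in sectors; a value of 0 means 65536 sectors.
    sectors = (features | features_exp << 8) or 65536
    return {"op": ops[command], "lba": lba, "sectors": sectors}


if __name__ == "__main__":
    # First timed-out command from the log above.
    print(decode_fpdma_acb("60 24 1b 4a c6 40 1e 00 00 00 00 00"))
```

Under that assumed layout, the first two reads in the log are adjacent: the first covers 0x24 sectors starting at LBA 0x1ec64a1b and the second starts at 0x1ec64a3f, directly after it.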
