Date: Fri, 22 Jan 2010 13:30:12 -0500
From: Toby Burress <kurin@delete.org>
To: freebsd-questions@freebsd.org
Subject: Drive errors in raidz array
Message-ID: <20100122183012.GD6476@lithium.delete.org>
I have a system with 24 drives in raidz2. When testing with bonnie++ it seemed to work fine (although I had to raise the arc_max to prevent kernel panics). However, now we're copying data to it and dmesg is showing many errors like:

mpt0: mpt_cam_event: 0x16
mpt0: request 0xffffff80005f3840:63495 timed out for ccb 0xffffff000988f800 (req->ccb 0xffffff000988f800)
mpt0: request 0xffffff80005f1f80:63496 timed out for ccb 0xffffff00098d0800 (req->ccb 0xffffff00098d0800)
mpt0: attempting to abort req 0xffffff80005f3840:63495 function 0
mpt0: request 0xffffff8000601ee0:63497 timed out for ccb 0xffffff011edaa800 (req->ccb 0xffffff011edaa800)
mpt0: request 0xffffff80005f4ec0:63498 timed out for ccb 0xffffff011eda5800 (req->ccb 0xffffff011eda5800)
mpt0: mpt_wait_req(1) timed out
mpt0: mpt_recover_commands: abort timed-out. Resetting controller
mpt0: mpt_cam_event: 0x0
mpt0: completing timedout/aborted req 0xffffff80005f3840:63495
mpt0: completing timedout/aborted req 0xffffff80005f1f80:63496
mpt0: completing timedout/aborted req 0xffffff8000601ee0:63497
mpt0: completing timedout/aborted req 0xffffff80005f4ec0:63498

followed by

(da0:mpt0:0:1:0): READ(10). CDB: 28 0 1 23 81 6f 0 0 2b 0
(da0:mpt0:0:1:0): CAM Status: SCSI Status Error
(da0:mpt0:0:1:0): SCSI Status: Check Condition
(da0:mpt0:0:1:0): UNIT ATTENTION asc:29,0
(da0:mpt0:0:1:0): Power on, reset, or bus device reset occurred
(da0:mpt0:0:1:0): Retrying Command (per Sense Data)

for every drive in the array. Additionally, zpool scrub says:

  pool: backups
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed after 0h0m with 0 errors on Thu Jan 21 23:15:36 2010

I'm using 8.0-RELEASE-p2 on amd64.
One other thing that changed between testing with bonnie++ and now is that I used glabel to label the drives before I put them in the raidz array. There is no RAID controller. Is this something anyone has seen before? Googling around shows some similar errors but no solutions.
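
To check whether the timeouts and resets really hit every drive, the dmesg output can be tallied per controller and per device. The following is only a minimal sketch in Python, assuming the log lines look exactly like the excerpts quoted above; the function name `summarize` and the regular expressions are illustrative and not part of any FreeBSD tool.

```python
import re
from collections import Counter

# Patterns derived from the dmesg excerpts quoted in this message.
# They are assumptions about the line format, not an official grammar.
TIMEOUT_RE = re.compile(r"^(mpt\d+): request \S+ timed out for ccb")
RESET_RE = re.compile(
    r"^(mpt\d+): mpt_recover_commands: abort timed-out\. Resetting controller"
)
UNIT_ATTN_RE = re.compile(r"^\((da\d+):mpt\d+:\d+:\d+:\d+\): UNIT ATTENTION")


def summarize(lines):
    """Count request timeouts and controller resets per mpt controller,
    and UNIT ATTENTION reports per da device."""
    timeouts = Counter()
    resets = Counter()
    unit_attention = Counter()
    for raw in lines:
        line = raw.strip()
        m = TIMEOUT_RE.match(line)
        if m:
            timeouts[m.group(1)] += 1
            continue
        m = RESET_RE.match(line)
        if m:
            resets[m.group(1)] += 1
            continue
        m = UNIT_ATTN_RE.match(line)
        if m:
            unit_attention[m.group(1)] += 1
    return timeouts, resets, unit_attention
```

Saving the dmesg output to a file (say, a hypothetical `dmesg.txt`) and calling `summarize(open("dmesg.txt"))` returns three counters. If the UNIT ATTENTION counter lists all 24 `da` devices after each reset, that would point at the controller or bus rather than at individual drives.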