Date: Mon, 07 Apr 2025 16:36:31 +0000
From: "Dave Cottlehuber" <dch@skunkwerks.at>
To: "Andrea Venturoli" <ml@netfence.it>, "Mike Tancsa" <mike@sentex.net>, freebsd-questions <freebsd-questions@freebsd.org>
Subject: Re: Sudden zpool checksums errors
Message-ID: <3b4abe36-7fda-4ab7-8893-d8fe07288ab7@app.fastmail.com>
In-Reply-To: <0e703e40-1d87-4c4b-a2b1-f370933f713a@netfence.it>
References: <6aeb488d-b3c3-4393-80ca-0b89c1ebc446@netfence.it> <3ddfecf7-2cb3-472c-bfce-93356e57b898@app.fastmail.com> <032776db-a8a1-4134-a395-a59effbc4c30@netfence.it> <4c6b64ec-0e59-4f64-8faf-117c7686a87d@sentex.net> <0e703e40-1d87-4c4b-a2b1-f370933f713a@netfence.it>

On Mon, 7 Apr 2025, at 15:15, Andrea Venturoli wrote:
> On 4/7/25 15:07, mike tancsa wrote:
>
> All "non-error" drives report:
> SCT Error Recovery Control:
>            Read: Disabled
>           Write: Disabled
>
> All "error" drives report:
> SCT Error Recovery Control:
>            Read:    655 (65.5 seconds)
>           Write:    670 (67.0 seconds)
>
> I wonder if this could be the culprit...
> I guess I should enable or disable it on all drives; however I've been
> reading mixed opinions on whether this is good or bad for ZFS.
>
> Any suggestion?

I would have a short timeout and rely on zfs to handle cleanup.

The thinking is that it is better for latency to return (failed) fast,
and let zfs give the correct data, then clean up afterwards, than
potentially have the entire drive be marked offline by zfs because
of the longer delay time.

Does this seem reasonable?

https://github.com/AMDmi3/scterc-rc.d
&
https://forums.truenas.com/t/checking-for-tler-erc-etc-support-on-a-drive/1497
may be useful.

A+
Dave
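
As a rough illustration of the "short timeout" idea above, here is a minimal sketch that only prints the smartctl commands it would run; nothing is changed on the drives. The 7-second value and the device names ada0 and ada1 are assumptions for illustration, not something taken from this thread. smartctl expects the read and write limits in units of 100 ms, so 7 seconds is written as 70.

```sh
#!/bin/sh
# Sketch only: print (do not execute) the smartctl command that would set
# the same SCT ERC read and write limit on a drive.

# scterc_cmd DEVICE SECONDS
# smartctl takes the limits in deciseconds (100 ms units), hence the * 10.
scterc_cmd() {
    dev=$1
    secs=$2
    ds=$((secs * 10))
    echo "smartctl -l scterc,${ds},${ds} /dev/${dev}"
}

# Hypothetical drive list; replace with the actual members of the pool.
for dev in ada0 ada1; do
    scterc_cmd "$dev" 7
done
```

Running `smartctl -l scterc /dev/ada0` without values shows the current setting, in the same format as the output quoted above. On many drives the setting is reported not to survive a power cycle, so it usually has to be reapplied at boot.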
