Date: Wed, 20 Nov 2024 09:15:55 +0100
From: Peter Eriksson <pen@lysator.liu.se>
To: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: zfs snapshot corruption when using encryption
Message-ID: <C0B8A0F3-F351-46FB-AA4E-B0C54C8661F9@lysator.liu.se>
In-Reply-To: <6F51D3D6-D9A3-46D8-94F2-535262F43EF4@FreeBSD.org>
References: <03E4CCF5-0F9A-4B0E-A9DA-81C7C677860C@FreeBSD.org> <Zy4XHlKodLu7utBa@int21h> <3E85AAAE-8B1E-47C7-B581-E3D98AB03907@FreeBSD.org> <Zy5kmpL_8dJh0AGZ@int21h> <F328561D-AD0A-475D-8E67-9DDD93468301@FreeBSD.org> <Zy7CB7trCVTD1fEv@int21h> <6F51D3D6-D9A3-46D8-94F2-535262F43EF4@FreeBSD.org>
I'm seeing something similar on one of our systems - the one system where I've just now started trying to use ZFS native encryption.

Setup:
FreeBSD 13.4-RELEASE-p1, 512GB RAM

3 zpools:
  zroot     - mirror of two SSD drives
  ENCRYPTED - ZFS over GELI-encrypted 10TB SAS drives
  SEKUR01D1 - ZFS over 18TB SAS drives, with ZFS native encryption enabled on individual filesystems

- ZFS snapshots are taken every hour of the ENCRYPTED zpool.
- zfs send is being done on some filesystems on the ENCRYPTED zpool.
- A big "cp -a" (about 70TB of files) is copying data from ENCRYPTED filesystems to SEKUR01D1 filesystems.

CKSUM errors pop up in zroot!

Fixed some errors yesterday, ran 'zpool scrub' & 'zpool clear' and got a clean bill of health:

# zpool status -v zroot
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0B in 00:07:15 with 0 errors on Tue Nov 19 21:46:36 2024
config:

	NAME                                STATE  READ WRITE CKSUM
	zroot                               ONLINE    0     0     0
	  mirror-0                          ONLINE    0     0     0
	    ada0p4                          ONLINE    0     0     0
	    diskid/DISK-PHDW817002MK150Ap4  ONLINE    0     0     0

errors: No known data errors

This morning:

# zpool scrub zroot

# zpool status -v zroot
  pool: zroot
 state: ONLINE
  scan: scrub in progress since Wed Nov 20 08:11:31 2024
	19.4G scanned at 6.48G/s, 772K issued at 257K/s, 49.3G total
	0B repaired, 0.00% done, no estimated completion time
config:

	NAME                                STATE  READ WRITE CKSUM
	zroot                               ONLINE    0     0     0
	  mirror-0                          ONLINE    0     0     0
	    ada0p4                          ONLINE    0     0     0
	    diskid/DISK-PHDW817002MK150Ap4  ONLINE    0     0     0

errors: No known data errors

# zpool status -v zroot
  pool: zroot
 state: ONLINE
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 00:07:20 with 1 errors on Wed Nov 20 08:18:51 2024
config:

	NAME                                STATE  READ WRITE CKSUM
	zroot                               ONLINE    0     0     0
	  mirror-0                          ONLINE    0     0     0
	    ada0p4                          ONLINE    0     0     2
	    diskid/DISK-PHDW817002MK150Ap4  ONLINE    0     0     2

errors: Permanent errors have been detected in the following files:

        /var/audit/20241119235400.20241120000543

Snapshots & zfs send are only being done on the "ENCRYPTED" zpool, not on "zroot" or "SEKUR01D1", i.e. not on the zpool with the ZFS-native-encrypted filesystems.

Not 100% sure it is related, but something is fishy. This is a server that has been running fine with GELI-encrypted disks for many years now...

- Peter

> On 9 Nov 2024, at 15:53, Palle Girgensohn <girgen@FreeBSD.org> wrote:
>
>> On 9 Nov 2024, at 02:59, void <void@f-m.fm> wrote:
>>
>> % zfs version
>
> Ah, of course.
>
> $ zfs version
> zfs-2.2.4-FreeBSD_g256659204
> zfs-kmod-2.2.4-FreeBSD_g256659204
>
> Palle
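PS. For anyone wanting to catch these intermittent CKSUM counters between scrubs, here is a minimal sh/awk sketch of the kind of check we could cron. The `check_cksum` helper name is mine, and it assumes the stock `zpool status` table layout (NAME STATE READ WRITE CKSUM); in production you would pipe `zpool status -v zroot` into it.

```shell
#!/bin/sh
# check_cksum: read `zpool status` text on stdin and print every vdev
# row whose CKSUM counter is nonzero. Hypothetical helper; only parses
# the plain 5-column config table (counters like "1.2K" would need
# extra handling).
check_cksum() {
    awk '
        # The table header row marks where vdev rows begin.
        $1 == "NAME" && $5 == "CKSUM" { in_table = 1; next }
        # A blank line ends the config table.
        NF == 0 { in_table = 0 }
        # Vdev rows have exactly 5 fields; flag nonzero CKSUM counts.
        in_table && NF == 5 && $5 + 0 > 0 {
            printf "CKSUM errors on %s: %s\n", $1, $5
        }
    '
}
```

Usage: `zpool status -v zroot | check_cksum` from cron, mailing any output to the admin.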