Date: Wed, 27 Jan 2021 09:55:49 +0000
From: Matt Churchyard <matt.churchyard@userve.net>
To: "freebsd-fs@freebsd.org" <freebsd-fs@freebsd.org>
Subject: ZFS issues on 13-current snapshot
Message-ID: <283b2b0b5df34c4f9b8c78b078a67381@SERVER.ad.usd-group.com>
Hello,

I'm testing a 13-CURRENT machine for future use as an encrypted offsite backup store. As 13.0 is close to release, I was hoping to get away with running this snapshot for a few months and then switching to a RELEASE boot environment when it comes out. However, I seem to be having a few issues.

First of all, I started noticing that the USED and REFER columns aren't equal for individual datasets. This system has so far simply received a single snapshot of a few datasets, with readonly set immediately afterwards. Some of them are showing several hundred MB attributed to snapshots on datasets that haven't been touched, and I'm unable to send further snapshots without forcing a rollback first. It's not the end of the world, but it isn't right and has never happened on previous ZFS systems. The most I've seen before is a few KB, because I forgot to set readonly and went into a few directories on a dataset with atime=on.

NAME                                USUED  AVAIL  REFER  MOUNTPOINT
offsite                              446G  6.36T   140K  /offsite
[...]
offsite/secure/cms                   359M  6.36T   341M  /offsite/secure/cms
offsite/secure/cms@26-01-2021       17.6M      -   341M  -
offsite/secure/company               225G  6.36T   224G  /offsite/secure/company
offsite/secure/company@25-01-2021    673M      -   224G  -

offsite/secure is an encrypted dataset using default options.

zfs diff will sit for a while (on small datasets - I gave up trying to run it on anything over a few GB) and eventually output nothing.

root@offsite:/etc # uname -a
FreeBSD offsite.backup 13.0-CURRENT FreeBSD 13.0-CURRENT #0 main-c255641-gf2b794e1e90: Thu Jan  7 06:25:26 UTC 2021     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

root@offsite:/etc # zpool version
zfs-0.8.0-1
zfs-kmod-0.8.0-1

I then thought I would run a scrub, just to see if it found any obvious problems.

It started off fine, estimating about 45-60 minutes for the whole process of scanning 446GB (this is four basic 4TB SATA IronWolf disks in raidz2). However, it appeared to stall at 19.7%. It eventually hit 19.71% and does appear to be creeping up, but at this point it looks like it may take days to complete (it currently says 3 hours, but that estimate is skewed by the initial fast progress and goes up every time I check).

gstat shows the disks at 100% busy, doing anywhere between 10-50MB/s. (They were hitting up to 170MB/s to start with. Obviously this varies when having to seek, but even at the rates currently seen I'd expect it to be progressing faster than the zpool output shows.)

root@offsite:/etc # zpool status
  pool: offsite
 state: ONLINE
  scan: scrub in progress since Wed Jan 27 09:29:50 2021
        555G scanned at 201M/s, 182G issued at 65.8M/s, 921G total
        0B repaired, 19.71% done, 03:11:51 to go
config:

        NAME                   STATE     READ WRITE CKSUM
        offsite                ONLINE       0     0     0
          raidz2-0             ONLINE       0     0     0
            gpt/data-ZGY85VKX  ONLINE       0     0     0
            gpt/data-ZGY88MRY  ONLINE       0     0     0
            gpt/data-ZGY88NZJ  ONLINE       0     0     0
            gpt/data-ZGY88QKF  ONLINE       0     0     0

errors: No known data errors

Update: I've probably spent 30+ minutes writing this email, and it's now reporting a few more GB scanned but no change at all in the progress percentage.

  scan: scrub in progress since Wed Jan 27 09:29:50 2021
        559G scanned at 142M/s, 182G issued at 46.2M/s, 921G total
        0B repaired, 19.71% done, 04:33:08 to go

It doesn't inspire a lot of confidence. ZFS has become pretty rock solid in FreeBSD in recent years and I have many systems running it. This release should have the most efficient scrub code to date, and yet it's currently taking about an hour to progress 0.01% on a new system with a fraction of the data it will eventually hold and 0% fragmentation.
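For reference, the workflow that got the pool into this state was nothing more exotic than a plain send/receive. Roughly the following, with the source pool name ("tank") and the exact flags simplified from memory:

    # on the source machine: snapshot the datasets and send them to the offsite box
    zfs snapshot -r tank/secure@25-01-2021
    zfs send -R tank/secure@25-01-2021 | ssh offsite.backup zfs receive -u offsite/secure

    # on the offsite machine: set readonly as soon as the receive completes
    zfs set readonly=on offsite/secure

    # later incrementals are refused unless the destination is forcibly rolled back with -F
    zfs send -R -i @25-01-2021 tank/secure@26-01-2021 | ssh offsite.backup zfs receive -F offsite/secure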
As it stands, I will likely scrap this attempt and retry with FreeBSD 12.

Regards,
Matt Churchyard