From: Matt Churchyard <matt.churchyard@userve.net>
To: freebsd-fs@freebsd.org
Subject: ZFS issues on 13-current snapshot
Date: Wed, 27 Jan 2021 09:55:49 +0000
Hello,

I'm testing a 13-current machine for future use as an encrypted offsite backup store. As 13 is near release, I was hoping to get away with using this snapshot for a few months and then switch to a RELEASE boot environment when it comes out. However, I seem to be having a few issues.

First of all, I started noticing that the USED and REFER columns weren't equal for individual datasets. This system has so far simply received a single snapshot of a few datasets and had readonly set immediately after. Some of them are showing several hundred MB linked to snapshots on datasets that haven't been touched, and I'm unable to send further snapshots without forcing a rollback first. Not the end of the world, but this isn't right and has never happened on previous ZFS systems. The most I've seen before is a few KB, because I forgot to set readonly and went into a few directories on a dataset with atime=on.

offsite                             446G  6.36T   140K  /offsite
[...]
offsite/secure/cms                  359M  6.36T   341M  /offsite/secure/cms
offsite/secure/cms@26-01-2021      17.6M      -   341M  -
offsite/secure/company              225G  6.36T   224G  /offsite/secure/company
offsite/secure/company@25-01-2021   673M      -   224G  -

offsite/secure is an encrypted dataset using default options.

zfs diff will sit for a while (on small datasets; I gave up trying to run it on anything over a few GB) and eventually outputs nothing.

root@offsite:/etc # uname -a
FreeBSD offsite.backup 13.0-CURRENT FreeBSD 13.0-CURRENT #0 main-c255641-gf2b794e1e90: Thu Jan  7 06:25:26 UTC 2021     root@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

root@offsite:/etc # zpool version
zfs-0.8.0-1
zfs-kmod-0.8.0-1

I then thought I would run a scrub just to see if it found any obvious problems.

It started off running fine, estimating about 45-60 minutes for the whole process of scanning 446GB. (This is 4 basic SATA Ironwolf 4TB disks in raidz2.)

However, it appeared to stall at 19.7%. It eventually hit 19.71% and does appear to be going up, but at this point it looks like it may take days to complete (it currently says 3 hours, but that's skewed by the initial fast progress and goes up every time I check).

Gstat shows the disks at 100% busy doing anywhere between 10-50MB/s. (They were hitting up to 170MB/s to start off with. Obviously this varies when having to seek, but even at the rates currently seen I suspect it should be progressing faster than the zpool output shows.)

root@offsite:/etc # zpool status
  pool: offsite
 state: ONLINE
  scan: scrub in progress since Wed Jan 27 09:29:50 2021
        555G scanned at 201M/s, 182G issued at 65.8M/s, 921G total
        0B repaired, 19.71% done, 03:11:51 to go
config:

        NAME                   STATE     READ WRITE CKSUM
        offsite                ONLINE       0     0     0
          raidz2-0             ONLINE       0     0     0
            gpt/data-ZGY85VKX  ONLINE       0     0     0
            gpt/data-ZGY88MRY  ONLINE       0     0     0
            gpt/data-ZGY88NZJ  ONLINE       0     0     0
            gpt/data-ZGY88QKF  ONLINE       0     0     0

errors: No known data errors

Update: I've probably spent 30+ minutes writing this email, and it's now reporting a few more GB read, but the progress percentage hasn't moved by a single digit.

  scan: scrub in progress since Wed Jan 27 09:29:50 2021
        559G scanned at 142M/s, 182G issued at 46.2M/s, 921G total
        0B repaired, 19.71% done, 04:33:08 to go

It doesn't inspire a lot of confidence.
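The next thing I plan to try is pausing and resuming the scrub to see whether the issued rate and the estimate settle down. If I'm reading zpool(8) right, that should just be the following (I haven't actually tried it on this pool yet):

    # pause the running scrub
    zpool scrub -p offsite
    # issuing scrub again on a paused pool should resume it rather than restart it
    zpool scrub offsite
    # then keep watching the scanned vs issued figures
    zpool status offsite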
ZFS has become pretty rock solid in FreeBSD in recent years and I have many systems running it. This version should have the most efficient scrub code to date, yet it is currently taking about an hour to progress 0.01% on a new system with a fraction of the data it will hold and 0% fragmentation.

As it stands at the moment, I will likely scrap this attempt and retry with FreeBSD 12.

Regards,
Matt Churchyard
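P.S. If it would help to see where the unexpected snapshot usage on the untouched datasets is coming from, I'm happy to post the output of something like the following (as far as I know these properties are all available in this OpenZFS version):

    # per-dataset breakdown of what USED consists of (USEDSNAP, USEDDS, etc.)
    zfs list -r -o space offsite/secure
    # how much has been written to each dataset since its snapshot,
    # plus confirmation that readonly/atime are set as expected
    zfs get -r usedbysnapshots,written,readonly,atime offsite/secure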