From owner-freebsd-fs@freebsd.org Wed Mar 25 23:29:56 2020
From: Attila Nagy <bra@fsn.hu>
Date: Thu, 26 Mar 2020 00:29:10 +0100
To: freebsd-fs@freebsd.org
Subject: Importing a vdev copied zpool from file
Hi,

I'm wondering why this doesn't work and what could be done to make it work?

# zpool status disk0
  pool: disk0
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        disk0       ONLINE       0     0     0
          da0       ONLINE       0     0     0

errors: No known data errors
# zpool export disk0
# dd if=/dev/da0 of=/data/da0
# zpool import -d /data disk0
   pool: disk0
     id: 13816971982532029716
  state: UNAVAIL
 status: One or more devices contains corrupted data.
 action: The pool cannot be imported due to damaged devices or data.
   see: http://illumos.org/msg/ZFS-8000-5E
 config:

        disk0                   UNAVAIL  insufficient replicas
          10876703685892021104  UNAVAIL  corrupted data

From owner-freebsd-fs@freebsd.org Wed Mar 25 23:33:02 2020
From: Gary Palmer <gpalmer@freebsd.org>
To: Attila Nagy
Cc: freebsd-fs@freebsd.org
Date: Wed, 25 Mar 2020 23:32:41 +0000
Subject: Re: Importing a vdev copied zpool from file
Message-ID: <20200325233241.GA43047@in-addr.com>

On Thu, Mar 26, 2020 at
12:29:10AM +0100, Attila Nagy wrote:
> I'm wondering why this doesn't work and what could be done to make it work?
>
> # zpool export disk0
> # dd if=/dev/da0 of=/data/da0
> # zpool import -d /data disk0
>  state: UNAVAIL
> status: One or more devices contains corrupted data.
> action: The pool cannot be imported due to damaged devices or data.

Use mdconfig(8) or similar to turn the file into a device and then it
should work.

Regards,

Gary

From owner-freebsd-fs@freebsd.org Thu Mar 26 00:27:40 2020
From: Rick Macklem <rmacklem@uoguelph.ca>
To: Peter Eriksson, FreeBSD Filesystems
Subject: Re: ZFS/NFS hickups and some tools to monitor stuff...
Date: Thu, 26 Mar 2020 00:27:10 +0000
Peter Eriksson wrote:
>The last couple of weeks I've been fighting with a severe case of NFS
>users complaining about slow response times from our (5) FreeBSD
>11.3-RELEASE-p6 file servers.
>Now even though our SMB (Windows) users (thankfully, since they are like
>500 per server vs 50 NFS users) didn't see the same slowdown (or at least
>didn't complain about it), the root cause is probably ZFS-related.
>
>We've identified a number of cases where some ZFS operation can cause
>severe slowdown of NFS operations, and I've been trying to figure out
>what the cause is and ways to mitigate the problem...
>
>Some operations that have caused issues:
>
>1. Resilver (basically made NFS service useless during the week it
>took...) with response times for NFS operations regularly up to 10
>seconds or more (vs the normal 1-10ms)
>
>2. Snapshot recursive deferred destruction ("zfs destroy -dr
>DATA@snapnam"). Especially bad together with filesystems at or near
>quota.
>
>3. Rsync cloning of data into the servers. Response times up to 15
>minutes were seen... Yes, 15 minutes to do a mkdir("test-dir"). Possibly
>in conjunction with #1 above...
>
>Previously #1 and #2 haven't caused that many problems, and #3 definitely
>hasn't. Something has changed in the last half year or so, but so far I
>haven't been able to figure it out.
>
[stuff snipped]
>It would be interesting to see if others too are seeing ZFS and/or NFS
>slowdowns during heavy writing operations (resilver, snapshot-destroy,
>rsync)...
>
>Our DATA pools are basically 2xRAIDZ2(4+2) of 10TB 7200rpm disks + 400GB
>SSDs for ZIL + 400GB SSDs for L2ARC.
>256GB RAM, configured with ARC-MAX set to 64GB (used to be 128GB but we
>ran into out-of-memory with the 500+ Samba smbd daemons that would
>compete for the RAM...)

Since no one else has commented, I'll mention a few things.
First the disclaimer... I never use ZFS and know nothing about SSDs, so a
lot of what I'll be saying comes from discussions I've seen by others.

Now, I see you use a mirrored pair of SSDs for ZIL logging devices.
You don't mention what NFS client(s) are mounting the server, so I'm
going to assume they are Linux systems.
- I don't know how the client decides, but I have seen Linux NFS packet
  traces where the client does a lot of 4K writes with FILE_STABLE.
  FILE_STABLE means that the data and metadata related to the write must
  be on stable storage before the RPC replies NFS_OK.
  --> This means the data and metadata changes must be written to the ZIL.
As such, really slow response when a ZIL log device is being resilvered
isn't surprising to me.
For the other cases, there is a heavy write load, which "might" also be
hitting the ZIL log hard.

What can you do about this?
- You can live dangerously and set "sync=disabled" for ZFS. This means
  that the writes will reply NFS_OK without needing to write to the ZIL
  log first. (I don't know enough about ZFS to know whether or not this
  makes the ZIL log no longer get used?)
- Why do I say "live dangerously"?
  Because data writes could get lost when the NFS server reboots and the
  NFS client would think the data was written just fine.

I'm the last guy to discuss SSDs, but they definitely have weird
performance for writing and can get very slow for writing, especially
when they get nearly full.
--> I have heard others recommend limiting the size of your ZIL to at
    most 1/2 of the SSD's capacity, assuming the SSD is dedicated to the
    ZIL and nothing else. (I have no idea if you already do this?)

Hopefully others will have further comments, rick

>We've tried it with and without L2ARC, and replaced the SSDs. Disabled
>TRIM. Not much difference. Tried tuning various sysctls but no
>difference seen so far. Annoying problem, this...
>
>- Peter

_______________________________________________
freebsd-fs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

From owner-freebsd-fs@freebsd.org Thu Mar 26 11:41:39 2020
From: Gary Palmer <gpalmer@freebsd.org>
To: Attila Nagy
Cc:
  freebsd-fs@freebsd.org
Date: Thu, 26 Mar 2020 11:41:23 +0000
Subject: Re: Importing a vdev copied zpool from file
Message-ID: <20200326114123.GA98069@in-addr.com>

On Thu, Mar 26, 2020 at 10:21:23AM +0100, Attila Nagy wrote:
> On Thu, 26 Mar 2020 at 00:32, Gary Palmer wrote:
> >
> > Use mdconfig(8) or similar to turn the file into a device and then it
> > should work
>
> Sure, that works (also iscsi, ggate etc., but mdconfig is the easiest
> amongst them if the file is locally available), thanks.
> I'm just wondering why it doesn't work with the zpool interface, which
> is much more convenient to use.
> Maybe because of the whole-disk schema (ZFS arranges data differently
> on a block device than in a file)?

Disks and files have different interfaces in the kernel. The fact that
mdconfig(8) exists shows that it is possible to make the kernel treat a
flat file as a device. Thus it should be possible to teach that to ZFS
too, or to any other filesystem such as UFS. However, I'm not sure the
extra complexity in the kernel to do that for each filesystem is worth it
when a generic, filesystem-independent interface already exists in
mdconfig(8).
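In concrete terms, the mdconfig(8) route looks something like the
following sketch (untested here; the md unit number is whatever
mdconfig(8) happens to allocate, md0 is just an assumption):

```shell
# Attach the flat file as a vnode-backed memory disk.
# mdconfig prints the unit it allocated, e.g. "md0".
mdconfig -a -t vnode -f /data/da0

# zpool import searches /dev by default, so the pool on the
# md device is now visible to a plain import.
zpool import disk0

# When done: export the pool and detach the memory disk again.
zpool export disk0
mdconfig -d -u md0
```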
Regards,

Gary

From owner-freebsd-fs@freebsd.org Thu Mar 26 12:59:44 2020
From: mike tancsa <mike@sentex.net>
To: Attila Nagy, freebsd-fs@freebsd.org
Subject: Re: Importing a vdev copied zpool from file
Message-ID: <616f6969-b137-7db6-894f-f7612e67abcd@sentex.net>
Date: Thu, 26 Mar 2020 08:59:17 -0400
On 3/25/2020 7:29 PM, Attila Nagy wrote:
> Hi,
>
> I'm wondering why this doesn't work and what could be done to make it work?
>
> # dd if=/dev/da0 of=/data/da0

What if you add the step

mdconfig -o async -t vnode -f /data/da0

and then try the import?  Note, the async option adds a pretty big speed
increase.

    ---Mike

From owner-freebsd-fs@freebsd.org Thu Mar 26 14:07:58 2020
From: Gary Palmer <gpalmer@freebsd.org>
To: mike tancsa
Cc: Attila Nagy, freebsd-fs@freebsd.org
Subject: Re: Importing a vdev copied zpool from file
Date: Thu, 26 Mar 2020 14:07:39 +0000
Message-ID: <20200326140739.GB98069@in-addr.com>
References: <616f6969-b137-7db6-894f-f7612e67abcd@sentex.net>
On Thu, Mar 26, 2020 at 08:59:17AM -0400, mike tancsa wrote:
> On 3/25/2020 7:29 PM, Attila Nagy wrote:
> > I'm wondering why this doesn't work and what could be done to make it work?
> >
> > # dd if=/dev/da0 of=/data/da0
>
> What if you add the step
>
> mdconfig -o async -t vnode -f /data/da0
>
> and then try the import?  Note, the async option adds a pretty big
> speed increase

At least on 11.3, that option has this caveat in the man page:

     -o [no]option
             Set or reset options.

             [no]async
                     For vnode backed devices: avoid IO_SYNC for
                     increased performance but at the risk of
                     deadlocking the entire kernel.
Not sure if the risk of deadlocking the kernel is still there or if it's
worth any potential speedup.

Regards,

Gary

From owner-freebsd-fs@freebsd.org Thu Mar 26 14:17:34 2020
From: mike tancsa <mike@sentex.net>
To: Gary Palmer
Cc: Attila Nagy, freebsd-fs@freebsd.org
Subject: Re: Importing a vdev copied zpool from file
Date: Thu, 26 Mar 2020 10:17:20 -0400
In-Reply-To: <20200326140739.GB98069@in-addr.com>

On 3/26/2020 10:07 AM, Gary Palmer wrote:
> What if you add the step
>> mdconfig -o async -t vnode -f /data/da0
>>
>> and then try the import?  Note, the async option adds a pretty big
>> speed increase
>
> At least on 11.3, that option has this caveat in the man page:
>
>      [no]async
>              For vnode backed devices: avoid IO_SYNC for increased
>              performance but at the risk of deadlocking the entire
>              kernel.
>
> Not sure if the risk of deadlocking the kernel is still there or if
> it's worth any potential speedup

It's quite a difference in speed. I have been using it for a while on
RELENG_12 without any deadlocks while virtualizing servers.  On large
imports, it would give close to a 50% speedup.

    ---Mike

From owner-freebsd-fs@freebsd.org Fri Mar 27 07:50:24 2020
From: Artem Kuchin <artem@artem.ru>
To: freebsd-fs@freebsd.org
Subject: Recovering bad sectors and smartctl no lba in error report
Message-ID:
<345b7285-958b-ef52-70a9-084872cf7409@artem.ru>
Date: Fri, 27 Mar 2020 10:49:58 +0300
Hello!

One of my RAID 1 disks went a little 'woohoo' and I got at least one
read error on the swap partition. I've disabled swap altogether (and it
actually made everything better) and ran a smartctl test. Here is the
output: https://artem.ru/ada2.txt

I will describe my logic step by step, and closer to the end I will
have questions. You can skip to the end, to the QUESTIONS section :)

What's strange is this:

  5 Reallocated_Sector_Ct   0x0033  100  100  005  Pre-fail  Always   -  0
197 Current_Pending_Sector  0x0022  100  100  000  Old_age   Always   -  8
198 Offline_Uncorrectable   0x0008  100  100  000  Old_age   Offline  -  0

So, sectors are in a read-error state, but Offline_Uncorrectable is 0.

Okay, now the test results:

SMART Self-test log structure revision number 1
Num  Test_Description   Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline      Completed without error  00%        46183            -
# 2  Extended offline   Completed: read failure  20%        46181            -
# 3  Short offline      Completed without error  00%        46170            -

As you see - NO LBA/sector is specified.
From the log:

Error 5 occurred at disk power-on lifetime: 46151 hours (1922 days + 23 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 a0 08 de 3e 0b  Error: UNC at LBA = 0x0b3ede08 = 188669448

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ---------------  --------------------
  60 80 48 e8 84 4e 40 00  10:43:18.103     READ FPDMA QUEUED
  61 08 40 48 04 21 40 00  10:43:18.103     WRITE FPDMA QUEUED
  60 40 38 e8 94 32 40 00  10:43:18.103     READ FPDMA QUEUED
  61 08 30 20 b9 ef 40 00  10:43:18.103     WRITE FPDMA QUEUED
  61 30 28 68 22 03 40 00  10:43:18.103     WRITE FPDMA QUEUED

And 188669448 is the only LBA mentioned in the log.

So, my logic is the following: this HDD has "Sector Sizes: 512 bytes
logical, 4096 bytes physical", so LBA/(4096/512) = physical sector
number. What I need to do is write the whole physical sector (8 LBAs)
to trigger sector reallocation, by doing something simple like:

dd if=/dev/zero of=/dev/ada2 bs=4096 count=1 seek=CALCULATED_VALUE

and then fsync to really make it write to the HDD. However, I need to
know which file is damaged. So, now to the questions.

QUESTIONS:

1) Why does the SMART report not show the LBA in the test result table?
2) Is my logic correct?
3) How do I find which file is using that LBA/sector?
4) I see that there are 8 pending sectors. Is that physical sectors or
LBAs? If LBAs, then okay, it matches one physical sector, but if it is
physical sectors, then how do I get a list of them?
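The sector arithmetic described above can be sketched in a few lines of shell. This is only an illustration of the calculation, assuming the LBA from the SMART log and the 512-byte-logical / 4096-byte-physical geometry quoted above; the dd command is printed, not executed, and should be triple-checked before being run against a real disk:

```shell
# Sketch only: compute the dd seek value for the 4 KiB physical sector
# containing a bad 512-byte LBA. Integer division targets the containing
# sector even if the LBA is not 8-aligned.
lba=188669448                     # from the SMART error log above
lbas_per_sector=$((4096 / 512))   # 8 logical LBAs per physical sector
seek=$((lba / lbas_per_sector))
echo "dd if=/dev/zero of=/dev/ada2 bs=4096 count=1 seek=$seek"
```

For this LBA the computed seek value is 23583681.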
Artem

From owner-freebsd-fs@freebsd.org Fri Mar 27 08:54:46 2020
To: freebsd-fs@freebsd.org
From: Alexander Povolotsky (Александр Поволоцкий)
Subject: How do I recover a crashed pool?
Date: Fri, 27 Mar 2020 11:54:15 +0300
Hello,

I've experienced a zpool crash after a hard reset:

  pool: fast
 state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Destroy and re-create the pool from
        a backup source.
   see: http://illumos.org/msg/ZFS-8000-72
  scan: scrub repaired 0 in 0 days 00:07:28 with 0 errors on Fri Mar  6 04:27:10 2020
config:

        NAME        STATE     READ WRITE CKSUM
        fast        FAULTED      0     0     1
          mirror-0  FAULTED      0     0     6
            ada0p2  ONLINE       0     0     6
            ada1p2  ONLINE       0     0     6

Unfortunately, I exported the pool before attempting zpool clear -F.

Is there any way to import the pool read-only to save the data from it?
I recall a case when advice from here helped me recover all data from a
pool with both devices broken...

--
Alex
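On the read-only question: zpool import does accept -o readonly=on, and -F requests a rewind to the last importable transaction group (-n makes that a dry run). A hedged sketch follows — the commands are printed rather than executed so the sequence can be reviewed first, the pool name "fast" is taken from the status output above, and whether a usable rewind point exists for this particular pool is unknown:

```shell
# Sketch only: print the candidate recovery commands for review.
pool=fast
# Dry run: -n reports whether rewinding (-F) could make the pool
# importable, without modifying anything.
echo "zpool import -f -F -n $pool"
# If the dry run looks promising, import read-only so nothing further
# is written while the data is copied off.
echo "zpool import -o readonly=on -f -F $pool"
```

If a read-only import succeeds, the data can be copied off (zfs send or plain rsync) before the pool is destroyed and re-created.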
From owner-freebsd-fs@freebsd.org Fri Mar 27 19:10:41 2020
Subject: Re: Recovering bad sectors and smartctl no lba in error report
To: freebsd-fs@freebsd.org
From: Artem Kuchin
Date: Fri, 27 Mar 2020 22:10:16 +0300
One more strange thing I found out:

> Error 5 occurred at disk power-on lifetime: 46151 hours (1922 days + 23
> hours)
>   When the command that caused the error occurred, the device was
>   active or idle.
>
>   After command completion occurred, registers were:
>   ER ST SC SN CL CH DH
>   -- -- -- -- -- -- --
>   40 51 a0 08 de 3e 0b  Error: UNC at LBA = 0x0b3ede08 = 188669448

The only error I saw in the log is about the swap partition not being
readable. However:

# gpart show

=>        34  5860533101  ada2  GPT  (2.7T)
          34           6        - free -  (3.0K)
          40         128     1  freebsd-boot  (64K)
         168     8388608     2  freebsd-swap  (4.0G)
     8388776  5852144352     3  freebsd-ufs  (2.7T)
  5860533128           7        - free -  (3.5K)

Now see which partition this LBA belongs to. The blocks in the gpart
show output are 512 bytes, and the LBA is 512-byte too, so we can just
compare the numbers. And as you see, 188669448 is NOT in the swap
partition. It is in FREEBSD-UFS! So, some file is damaged there and I
need to know which one.

I need a way to map an LBA (block) to a file. Linux has the debugfs
utility, but I haven't found anything like that for FreeBSD.
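The comparison above can be done mechanically. A small sketch, using the partition offsets from the gpart output (everything in 512-byte LBAs); the final dd line is printed rather than run, as a read-only way to inspect the damaged block's contents:

```shell
# Decide which partition an absolute disk LBA falls into, using the
# start/size columns from "gpart show" above.
lba=188669448
swap_start=168;    swap_size=8388608
ufs_start=8388776; ufs_size=5852144352

if [ "$lba" -ge "$swap_start" ] && [ "$lba" -lt "$((swap_start + swap_size))" ]; then
    echo "LBA $lba is in freebsd-swap"
elif [ "$lba" -ge "$ufs_start" ] && [ "$lba" -lt "$((ufs_start + ufs_size))" ]; then
    echo "LBA $lba is in freebsd-ufs, at partition-relative LBA $((lba - ufs_start))"
fi
# prints: LBA 188669448 is in freebsd-ufs, at partition-relative LBA 180280672

# To eyeball the damaged physical sector's contents (read-only):
echo "dd if=/dev/ada2 bs=512 skip=$lba count=8 | hexdump -C"
```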
Artem

From owner-freebsd-fs@freebsd.org Fri Mar 27 20:04:54 2020
From: Artem Kuchin
Subject: fsdb findblk does not do anything
To: freebsd-fs@freebsd.org
Message-ID: <78d6f646-d981-dec4-849e-4637dba63d5f@artem.ru>
Date: Fri, 27 Mar 2020 23:04:23 +0300
I tried to see how the findblk command of fsdb works. So I did:

# cd /
# ls -li
      10 -rw-r-----   2 root  wheel         973 Jan 13  2015 .cshrc

So we have inode 10 for .cshrc.

# istat /dev/ada2p3 10
inode: 10
Allocated
Group: 0
uid / gid: 0 / 0
mode: rrw-r-----
size: 973
num of links: 2
Inode Times:
Accessed:       2015-01-13 16:44:07 (MSK)
File Modified:  2015-01-13 16:44:07 (MSK)
Inode Modified: 2015-02-25 16:09:31 (MSK)
Direct Blocks: 5080

So we have block 5080 for this file. Let's try to find this inode by
block number:

# fsdb -r /dev/ada2p3
** /dev/ada2p3 (NO WRITE)
Examining file system `/dev/ada2p3'
Last Mounted on /
current inode: directory
I=2 MODE=40755 SIZE=512
        BTIME=Nov 12 00:03:46 2014 [0 nsec]
        MTIME=Mar 26 10:05:49 2020 [261484000 nsec]
        CTIME=Mar 26 10:05:49 2020 [261484000 nsec]
        ATIME=Dec 21 06:24:59 2014 [0 nsec]
OWNER=root GRP=wheel LINKCNT=21 FLAGS=0 BLKCNT=8 GEN=465e05d4
fsdb (inum: 2)> findblk 5080
fsdb (inum: 2)>

I waited for a long time, but nothing. Why?
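One likely source of confusion here (raised later in the thread) is units: findblk works in file-system block/fragment numbers relative to the partition, while a SMART error gives an absolute 512-byte disk LBA. A hedged sketch of the conversion — the partition offset comes from the earlier gpart output, and the 4096-byte fragment size is an assumption that should be checked with dumpfs on the actual filesystem:

```shell
# Sketch only: convert an absolute 512-byte disk LBA into a
# partition-relative fragment number for fsdb's findblk.
lba=188669448        # absolute disk LBA from the SMART error log
part_start=8388776   # first LBA of freebsd-ufs (from "gpart show")
fsize=4096           # assumed UFS fragment size; verify with
                     #   dumpfs /dev/ada2p3 | grep fsize

frag=$(( (lba - part_start) * 512 / fsize ))
echo "try: fsdb -r /dev/ada2p3, then: findblk $frag"
```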
From owner-freebsd-fs@freebsd.org Fri Mar 27 20:59:46 2020
Subject: Re: Recovering bad sectors and smartctl no lba in error report
To: Artem Kuchin, freebsd-fs@freebsd.org
From: Miroslav Lachman <000.fbsd@quip.cz>
Date: Fri, 27 Mar 2020 21:59:25 +0100
Artem Kuchin wrote on 2020/03/27 20:10:
> One more strange thing i found out
>
>> rror 5 occurred at disk power-on lifetime: 46151 hours (1922 days + 23
>> hours)
>>   When the command that caused the error occurred, the device was
>> active or idle.
>>
>>   After command completion occurred, registers were:
>>   ER ST SC SN CL CH DH
>>   -- -- -- -- -- -- --
>>   40 51 a0 08 de 3e 0b  Error: UNC at LBA = 0x0b3ede08 = 188669448
>
> The only error i saw in the log is about swap partition not redable.
>
> However,
>
> # gpart show
>
> =>        34  5860533101  ada2  GPT  (2.7T)
>           34           6        - free -  (3.0K)
>           40         128     1  freebsd-boot  (64K)
>          168     8388608     2  freebsd-swap  (4.0G)
>      8388776  5852144352     3  freebsd-ufs  (2.7T)
>   5860533128           7        - free -  (3.5K)
>
> Now see to which partition this LBA belongs. The block in the gpart show
> are 512 bytes and
>
> LBA is 512 too.
> So, we can just compare numbers. And as you see
> 188669448 is not in the swap
>
> partition. It is in FREEBSD-UFS! So, some file is damaged there and i
> need to know which one.
>
> I need a way to map LBA (block) to a file. Linux has debugfs utility,
> but i haven't found anything like that

I tried this a few years ago. It is hard and unreliable to find the
file belonging to a certain LBA. fsdb findblk uses file-system block
sizes, while the smartctl LBA reports the block in disk block sizes,
which can be 512 bytes or 4K. Then you need to subtract the partition
offset, etc.

One way to find the file is to "guess": read the given block with dd,
open it in an editor, and look at the content. Then you can guess which
file it might be. In one case I found it was lighttpd.conf just by
looking at the content of the file. In another case it was empty space.

Miroslav Lachman

From owner-freebsd-fs@freebsd.org Sat Mar 28 12:23:21 2020
Subject: Re: ZFS/NFS hickups and some tools to monitor stuff...
From: Peter Eriksson
In-Reply-To: <66AB88C0-12E8-48A0-9CD7-75B30C15123A@pk1048.com>
Date: Sat, 28 Mar 2020 13:22:56 +0100
Cc: "PK1048.COM"
To: FreeBSD Filesystems
> On 28 Mar 2020, at 04:07, PK1048.COM wrote:
>>
>> The last couple of weeks I've been fighting with a severe case of NFS
>> users complaining about slow response times from our (5) FreeBSD
>> 11.3-RELEASE-p6 file servers. Now even though our SMB (Windows) users
>> (thankfully, since they are like 500 per server vs 50 NFS users)
>> didn't see the same slowdown (or at least didn't complain about it),
>> the root cause is probably ZFS-related.
>
> What is the use case for the NFS users: Unix home directories? VM
> image files? etc.

Mostly home directories. No VM image files. About 20000 filesystems per
server with around 100 snapshots per filesystem. Around 150-180M
files/directories per server.

>> 3. Rsync cloning of data into the servers. Response times up to 15
>> minutes were seen… Yes, 15 minutes to do a mkdir("test-dir").
>> Possibly in conjunction with #1 above….
>
> There is something very wrong with your setup with that kind of NFS
> response time.
>
> Maybe I'm asking the obvious, but what is performance like natively on
> the server for these operations?
Normal response times: This is from a small Intel NUC running OmniOS, so
NFS 4.0:

$ ./pfst -v /mnt/filur01
[pfst, version 1.7 - Peter Eriksson ]
2020-03-28 12:19:10 [2114 µs]: /mnt/filur01: mkdir("t-omnibus-821-1")
2020-03-28 12:19:10 [ 413 µs]: /mnt/filur01: rmdir("t-omnibus-821-1")
2020-03-28 12:19:11 [1375 µs]: /mnt/filur01: mkdir("t-omnibus-821-2")
2020-03-28 12:19:11 [ 438 µs]: /mnt/filur01: rmdir("t-omnibus-821-2")
2020-03-28 12:19:12 [1329 µs]: /mnt/filur01: mkdir("t-omnibus-821-3")
2020-03-28 12:19:12 [ 428 µs]: /mnt/filur01: rmdir("t-omnibus-821-3")
2020-03-28 12:19:13 [1253 µs]: /mnt/filur01: mkdir("t-omnibus-821-4")
2020-03-28 12:19:13 [ 395 µs]: /mnt/filur01: rmdir("t-omnibus-821-4")

I.e., a mkdir() takes around 1-2 ms and an rmdir() 0.4 ms.

Same from a CentOS 7 client (different hardware, a Dell workstation,
NFS 4.1):

$ ./pfst -v /mnt/filur01
[pfst, version 1.6 - Peter Eriksson ]
2020-03-28 12:21:15 [ 633 µs]: /mnt/filur01: mkdir("t-electra-965-1")
2020-03-28 12:21:15 [ 898 µs]: /mnt/filur01: rmdir("t-electra-965-1")
2020-03-28 12:21:16 [1019 µs]: /mnt/filur01: mkdir("t-electra-965-2")
2020-03-28 12:21:16 [ 709 µs]: /mnt/filur01: rmdir("t-electra-965-2")
2020-03-28 12:21:17 [ 955 µs]: /mnt/filur01: mkdir("t-electra-965-3")
2020-03-28 12:21:17 [ 668 µs]: /mnt/filur01: rmdir("t-electra-965-3")

mkdir & rmdir take about the same amount of time here (0.6 - 1 ms).

(The above only tests operations on already-mounted filesystems.
Mounting filesystems has its own set of partly different problems
occasionally.)

> What does the disk %busy look like on the disks that make up the
> vdevs? (iostat -x)

I don't have those numbers (from when we were seeing problems)
unfortunately, but if I remember correctly, fairly busy during the
resilver (not surprising).
Current status (right now):

# iostat -x 10 | egrep -v pass
                     extended device statistics
device   r/s   w/s   kr/s    kw/s  ms/r ms/w ms/o ms/t qlen  %b
nvd0       0     0    0.0     0.0     0    0    0    0    0   0
da0        3    55   31.1  1129.4    10    1   87    3    0  13
da1        4    53   31.5  1109.1    10    1   86    3    0  13
da2        5    51   41.9  1082.4     9    1   87    3    0  14
da3        3    55   31.1  1129.7    10    1   85    2    0  13
da4        4    53   31.3  1108.7    10    1   86    3    0  13
da5        5    52   41.7  1081.3     9    1   87    3    0  14
da6        3    55   27.6  1103.9    10    1   94    2    0  13
da7        4    54   34.2  1064.7    10    1   92    3    0  14
da8        5    55   39.6  1088.2    10    1   94    3    0  15
da9        3    55   27.7  1103.4    10    1   92    2    0  13
da10       4    53   34.2  1065.0    10    1   92    3    0  14
da11       5    55   39.5  1089.0    10    1   95    3    0  15
da12       1    23    4.7   553.5     0    0    0    0    0   0
da13       1    23    4.7   553.5     0    0    0    0    0   0
da14       0    23    1.1   820.6     0    0    0    0    0   1
da15       0    23    1.0   820.7     0    0    0    0    0   1
da16       0     0    0.0     0.0     1    0    0    1    0   0
da17       0     0    0.0     0.0     1    0    0    0    0   0

>> Previously #1 and #2 hadn't caused that many problems, but #3
>> definitely had. Something has changed in the last half year or so,
>> but so far I haven't been able to figure out what.
>
> The degradation over time makes me think fragmentation from lots of
> small writes or an overly full zpool, but that should affect SMB as
> well as NFS.

Yeah, the zpool is only around 50% full, so no.
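(If you want to watch for saturated vdev members over time, a small awk
filter over the %b column can flag disks above a threshold. A sketch,
fed here from a canned sample rather than a live `iostat -x`; the column
layout is assumed to match the FreeBSD 11 output above.)

```sh
# Flag devices whose %b (last column) exceeds a threshold.
# iostat_sample stands in for the output of: iostat -x 10 | egrep -v pass
iostat_sample='device   r/s  w/s  kr/s   kw/s  ms/r ms/w ms/o ms/t qlen %b
da0        3   55  31.1 1129.4    10    1   87    3    0 13
da1        4   53  31.5 1109.1    10    1   86    3    0 93
nvd0       0    0   0.0    0.0     0    0    0    0    0  0'

busy=$(printf '%s\n' "$iostat_sample" |
    awk -v limit=80 'NR > 1 && $NF + 0 > limit { print $1 }')
echo "busy disks: $busy"    # → busy disks: da1
```

In a live pipeline you would drop the sample and pipe `iostat -x 10`
straight into the awk filter.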
(One of the servers right now:)

# zpool iostat -v DATA
                                   capacity     operations    bandwidth
pool                             alloc   free   read  write   read  write
-------------------------------- -----  -----  -----  -----  -----  -----
DATA                             57.7T  50.3T     46    556   403K  8.48M
  raidz2                         29.0T  25.0T     23    253   204K  3.49M
    diskid/DISK-7PK8DWXC             -      -      3     52  31.1K  1.10M
    diskid/DISK-7PK5X6XC             -      -      4     50  31.5K  1.08M
    diskid/DISK-7PK4W4BC             -      -      5     49  41.9K  1.06M
    diskid/DISK-7PK204LG             -      -      3     52  31.1K  1.10M
    diskid/DISK-7PK2GDHG             -      -      4     50  31.3K  1.08M
    diskid/DISK-7PK7850C             -      -      5     49  41.7K  1.06M
  raidz2                         28.8T  25.2T     23    256   199K  3.39M
    diskid/DISK-7PK62HHC             -      -      3     53  27.6K  1.08M
    diskid/DISK-7PK6SG3C             -      -      4     52  34.2K  1.04M
    diskid/DISK-7PK8DRHC             -      -      5     53  39.6K  1.06M
    diskid/DISK-7PK85ADG             -      -      3     53  27.7K  1.08M
    diskid/DISK-2TK7PBYD             -      -      4     51  34.2K  1.04M
    diskid/DISK-7PK6WY9C             -      -      5     53  39.5K  1.06M
logs                                 -      -      -      -      -      -
  diskid/DISK-BTHV7146043U400NGN  289M   372G      0     23      0   821K
  diskid/DISK-BTHV71460441400NGN  274M   372G      0     23      0   821K

> What type of SSDs are you using for the SLOG (every zpool has a
> built-in ZIL; you have a SLOG when you add separate devices for the
> Intent Log for sync operations)? I had tried a run-of-the-mill
> consumer SSD for the SLOG and NFS sync performance went DOWN! For SLOG
> devices you really must have real enterprise-grade, server-class SSDs
> designed for very heavy random WRITE load. Today you can get
> enterprise SSDs designed for READ, WRITE, or a combination.

Dell-original enterprise SSDs like the Intel DC S3520 and Intel DC S3700
series (SATA). We've also tested some Intel SSD 750s (NVMe on PCIe 3.0
cards).

Going to test some pure SAS write-optimised SSDs too (as soon as we can
get delivery of them). However, I don't really suspect the SSDs anymore.
(We've had other issues with the S3520 series, though, so we might
replace them anyway. They have a tendency to play "possum" every now and
then: "die" and go offline, but if we just let them sit there, after a
couple of weeks they would reappear again automagically... We've seen
the same thing happen with the Intel P3520s too (the PCIe variant), but
then they didn't just drop off the bus - they took down the server as
well by hanging the PCIe bus. Crappy buggers.)

> Depending on your NFS use case, you may or may not be seeing sync
> operations. If you are seeing async operations then a ZIL/SLOG will
> not help, as they are only used for sync operations. For async
> operations, multiple writes are combined in RAM and committed to disk
> at least every 5 seconds. For sync operations, the writes are
> immediately written to the ZIL/SLOG, so a SLOG _can_ be of benefit if
> it is fast enough for random write operations. Since a sync operation
> must be committed to non-volatile storage before the write call can
> return, and having a single SLOG device can lead to loss of data, it
> is prudent to mirror the SLOG devices. Having a mirrored pair does not
> slow down operations, as both devices are written to in parallel. The
> SLOG only exists to meet the sync requirement. If the system does not
> crash, the SLOG is (almost) never actually read. The exception, and
> the reason I say "almost", is that if you are writing so much data
> over long periods of time that the zpool disks cannot catch up between
> write requests, then you may have to read some TXGs (transaction
> groups) from the SLOG to get them into the zpool.
>
>> We've tried it with and without L2ARC,
>
> ARC and L2ARC will only affect read operations, and you can see the
> effect with zfs-stats -E.
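(For the record, attaching a mirrored SLOG as described above looks
roughly like this; the device and vdev names below are placeholders,
not the ones from this pool.)

```sh
# Sketch only -- da16/da17 and mirror-2 are placeholder names.
# Add a mirrored SLOG pair to the pool:
zpool add DATA log mirror da16 da17
# Verify the devices show up under "logs":
zpool status DATA
# A SLOG can also be removed again later, using the vdev name
# reported by "zpool status":
zpool remove DATA mirror-2
```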
Well, the L2ARC uses up some of the ARC (for its headers), so if the ARC
is too tight to begin with, an L2ARC will make reads go even slower.

The ARC is actually my main suspect right now, at least for the resilver
problem. When the resilver was running, it seems it would read a lot of
data into the ARC, causing it to drop older data (and metadata) from the
cache. That became a problem due to three issues:

1. The problematic server had "compressed ARC" disabled due to a
   previous kernel-crashing problem (which happened less often with it
   disabled).
2. ZFS prioritises writes over reads.
3. The metadata (directory information, for example) wasn't in the ARC
   anymore, so it needed to be read from disk again and again.

This was compounded by the facts that:

- We take hourly snapshots of all filesystems - this needs to access
  metadata for all filesystems.
- We clean snapshots nightly - this needs to access metadata for all
  filesystems (and all old snapshots).
- We had a problem with the cleaning script, so it was running behind
  and more and more snapshots were piling up on a couple of the file
  servers - around 400 snapshots per filesystem instead of 100 -
  including the server where we were resilvering.
- The 500-600 Samba "smbd" processes each use a lot of RAM (100-200 MB),
  i.e. 60-120 GB in total, so they compete with the ARC too.

Many NFS operations access metadata from the filesystems - mount
operations and filename lookups, for example. So if any of that was
evicted from the ARC, it would have to go to disk to read it again, and
when ZFS was busy writing resilver data, that would take a long time.

Also, when a filesystem is near its quota, the ZFS transaction size goes
down - inflating the write IOPS and starving the read IOPS again...

We are probably going to invest in more RAM for the servers (going from
256 GB to 512 GB or 768 GB) to allow for a much bigger ARC.
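(The snapshot pile-up can be caught early with a quick audit that counts
snapshots per filesystem. A sketch, fed here from a canned sample
instead of live `zfs list` output; the dataset names are made up.)

```sh
# Count snapshots per filesystem and flag those over a limit.
# snap_sample stands in for: zfs list -H -t snapshot -o name
snap_sample='DATA/home/alice@hourly-2020032801
DATA/home/alice@hourly-2020032802
DATA/home/bob@hourly-2020032801'

over=$(printf '%s\n' "$snap_sample" |
    awk -F@ '{ n[$1]++ }
             END { for (fs in n) if (n[fs] > 1) print fs, n[fs] }')
echo "$over"    # → DATA/home/alice 2
```

In production the limit would be something like 100 (the intended
per-filesystem snapshot count), and the output would feed an alert
rather than an echo.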
> You should not need to make any system tunings to make this work.

Well, I had to tune vfs.zfs.arc_max down from the default (~90% of RAM)
to 128 GB, since the Samba smbd processes also need RAM to run and the
auto-tuning of the ARC in FreeBSD doesn't seem to work that well. To
mitigate that problem, I'm currently using these settings:

vfs.zfs.arc_max = 96 GB (tuned down from 128 GB, since the ARC overshot
the 128 GB allocated and random processes started getting killed)

vfs.zfs.arc_meta_limit = 50% of the ARC (the default is 25%)

The overshooting of the ARC (especially when resilvering) was probably
due to us also tuning kern.maxvnodes - it seems that if you touch that
number when the system is already running, the ARC goes berserk. So I
can probably increase the ARC to 128 GB again; it looks to be more
stable now that we've removed the kern.maxvnodes setting completely.

It would have been nice to have a setting for a minimum amount of
metadata in the ARC, so metadata would be better protected from being
evicted by normal data...

We've installed 512 GB instead of 256 GB in a test server now and will
see what settings might work well there...
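(Collected in one place, the tuning above would look something like the
following sysctl.conf fragment. The byte values are just the GiB figures
from this message converted; treat them as a starting point for this
workload, not a general recommendation.)

```sh
# /etc/sysctl.conf fragment -- ARC tuning as described above.
# Cap the ARC at 96 GiB (down from the ~90%-of-RAM default):
vfs.zfs.arc_max=103079215104
# Let metadata use up to 50% of the ARC (default is 25%):
vfs.zfs.arc_meta_limit=51539607552
# Note: kern.maxvnodes is deliberately left at its default; changing it
# on a running system appeared to destabilise ARC sizing.
```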
Right now (with no resilver running, not many "near-full-quota"
filesystems, and many of the extraneous snapshots deleted), things are
looking much better:

$ ./pfst -t200ms /mnt/filur0*
...
2020-03-26 21:16:40 [1678 ms]: /mnt/filur01: mkdir("t-omnibus-782-190284") [Time limit exceeded]
2020-03-26 21:16:47 [1956 ms]: /mnt/filur01: mkdir("t-omnibus-782-190289") [Time limit exceeded]
2020-03-26 21:16:53 [1439 ms]: /mnt/filur01: mkdir("t-omnibus-782-190293") [Time limit exceeded]
2020-03-26 21:16:59 [1710 ms]: /mnt/filur01: mkdir("t-omnibus-782-190297") [Time limit exceeded]
2020-03-26 21:17:05 [2044 ms]: /mnt/filur01: mkdir("t-omnibus-782-190301") [Time limit exceeded]
2020-03-26 21:17:11 [1955 ms]: /mnt/filur01: mkdir("t-omnibus-782-190305") [Time limit exceeded]
2020-03-26 21:17:16 [1515 ms]: /mnt/filur01: mkdir("t-omnibus-782-190309") [Time limit exceeded]
...
2020-03-28 06:54:54 [ 370 ms]: /mnt/filur06: mkdir("t-omnibus-783-311285") [Time limit exceeded]
2020-03-28 07:00:01 [ 447 ms]: /mnt/filur01: mkdir("t-omnibus-782-311339") [Time limit exceeded]
2020-03-28 07:00:01 [ 312 ms]: /mnt/filur06: mkdir("t-omnibus-783-311591") [Time limit exceeded]
2020-03-28 07:00:02 [ 291 ms]: /mnt/filur06: mkdir("t-omnibus-783-311592") [Time limit exceeded]
2020-03-28 07:00:05 [ 378 ms]: /mnt/filur06: mkdir("t-omnibus-783-311594") [Time limit exceeded]
2020-03-28 10:35:13 [1876 ms]: /mnt/filur01: mkdir("t-omnibus-782-324215") [Time limit exceeded]

(This only prints operations taking more than 200 ms.)

The last ~2 s one is probably due to a new filesystem being created.
Creating a filesystem triggers the NFS mount daemon to update the
kernel's list of exports (holding an NFS kernel lock), which takes some
time - although it's much faster now with the "incremental update" fix
in mountd.

The slow ones around 21:16 are due to me doing a number of "zfs rename
DATA/home/$USER DATA/archive/home/$USER" operations. This also causes a
lot of NFS mountd activity (removing NFS exports from the kernel),
causing an NFS kernel lock to be held...

- Peter
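(For anyone reproducing this: the export churn can be triggered and
observed by hand. A sketch; the dataset name is a placeholder.)

```sh
# Sketch -- DATA/home/newuser is a placeholder dataset name.
# Creating an NFS-shared filesystem adds an entry to the export list:
zfs create -o sharenfs=on DATA/home/newuser
# mountd re-reads its exports on SIGHUP:
service mountd reload
# The resulting export list can be inspected with:
showmount -e localhost
```

With ~20,000 filesystems per server, each such update used to mean
re-processing the whole export list under the kernel lock, which is why
the incremental-update fix in mountd matters here.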