From: Chris Anderson
Date: Tue, 23 Feb 2021 11:30:58 -0600
Subject: Re: lots of "no such file or directory" errors in zfs filesystem
To: Andriy Gapon
Cc: freebsd-stable@freebsd.org

On Tue, Feb 23, 2021 at 4:53 AM Andriy Gapon wrote:

> On 23/02/2021 05:25, Chris Anderson wrote:
> > so I can't ls
> > -i the file since that triggers the no such file warning. if I run
> > zdb -dddd on the inode of a directory which contains one of those
> > missing files, I can get the inode of the file from that, but I don't
> > get anything particularly interesting in the output.
> >
> > most of the files that are missing are in directories with a large
> > number of files (the largest has 180k) but I managed to find a
> > directory which had a single file entry that is missing:
> >
> > Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects, rootbp
> > DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU objset]
> > fletcher4 uncompressed LE contiguous unique double size=800L/800P
> > birth=46916371L/46916371P fill=908537
> > cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09
> >
> >     Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
> >      38268    1   128K     1K      0     512     1K  100.00  ZFS directory
> >                                                264   bonus  ZFS znode
> >     dnode flags: USED_BYTES USERUSED_ACCOUNTED
> >     dnode maxblkid: 0
> >     uid     1001
> >     gid     1001
> >     atime   Sun Aug  6 02:00:41 2017
> >     mtime   Wed Apr 15 12:12:42 2020
> >     ctime   Wed Apr 15 12:12:42 2020
> >     crtime  Sat Aug  5 15:10:07 2017
> >     gen     23881023
> >     mode    40755
> >     size    3
> >     parent  38176
> >     links   2
> >     pflags  40800000144
> >     xattr   0
> >     rdev    0x0000000000000000
> >     microzap: 1024 bytes, 1 entries
> >
> >         hash_test.go = 38274 (type: Regular File)
> >
> > # zdb -dddd tank/home/cva 38274
> >
> > Dataset tank/home/cva [ZPL], ID 196, cr_txg 163, 109G, 908537 objects, rootbp
> > DVA[0]=<0:13210311000:1000> DVA[1]=<0:18b9a02c000:1000> [L0 DMU objset]
> > fletcher4 uncompressed LE contiguous unique double size=800L/800P
> > birth=46916371L/46916371P fill=908537
> > cksum=11fdd21d1d:13cb24c87a6e:da0c9bf1b5df3:715ab2ec45b7b09
> >
> >     Object  lvl   iblk   dblk  dsize  dnsize  lsize   %full  type
> > zdb: dmu_bonus_hold(38274) failed, errno 2
>
> So, this looks like a "simple" problem.
> Unfortunately, it is very hard to tell retrospectively what bug caused it.
> The directory has an entry for the file, but the file does not actually
> exist (or has a different ID).
> This is a logical inconsistency, not a data integrity issue.
> So, a scrub, being a data integrity check, would not detect such an issue.
> Hypothetical zfs_fsck is needed to find and repair such logical problems.
>

ah, I see. that makes sense.

> Does that pool and filesystem have any special history?
> I mean upgrades, replication via send/recv, moving between OS-s, etc.
>

nope, it led a pretty boring life. that zfs filesystem was created on that
server and has been on the same two mirrored disks for its lifetime. it has
had freebsd upgrades applied as they became available. zfs upgrades were for
the most part avoided until quite recently (though the problem existed prior
to the upgrades).

the server does have a relatively modest amount of ram (2GB). dunno if that
makes it more likely that these kinds of issues get triggered.
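
for reference, the inconsistency described above (a directory entry whose
target object cannot be found) can in principle be enumerated from userland,
since a scrub will not report it. what follows is only a minimal Python
sketch and not something taken from this thread: find_dangling_entries is a
hypothetical helper name, and it assumes that readdir still returns the
broken entries while lstat on them fails with ENOENT, which matches the
symptoms reported here. it also assumes the inode number reported by readdir
is the object number that zdb expects, as the zdb runs quoted above suggest.

```python
import os


def find_dangling_entries(root):
    """Yield (path, inode) for entries that readdir lists but lstat cannot find.

    Hypothetical helper: it only reports candidates and repairs nothing.
    """
    stack = [root]
    while stack:
        directory = stack.pop()
        try:
            # Read the whole directory first so the handle is closed promptly.
            with os.scandir(directory) as it:
                entries = list(it)
        except OSError:
            # Unreadable directory: skip it rather than abort the whole walk.
            continue
        for entry in entries:
            try:
                os.lstat(entry.path)
            except FileNotFoundError:
                # The name is in the directory, but the object behind it is not.
                # On Unix, entry.inode() comes from the readdir data, so it
                # does not need another stat call.
                yield entry.path, entry.inode()
                continue
            except OSError:
                # Other errors (permissions, I/O) are outside this sketch.
                continue
            if entry.is_dir(follow_symlinks=False):
                stack.append(entry.path)


if __name__ == "__main__":
    import sys

    for path, inode in find_dangling_entries(sys.argv[1]):
        print(inode, path)
```

each reported inode number could then be passed to zdb -dddd on the dataset,
as in the output quoted earlier, to confirm that the object really is
missing. a file deleted while the walk is running would also show up, so the
result is best treated as a list of candidates on a quiet filesystem.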