Date: Sun, 12 Apr 2020 14:56:01 +0300
From: Andriy Gapon <avg@FreeBSD.org>
To: Peter Eriksson <pen@lysator.liu.se>, freebsd-fs <freebsd-fs@FreeBSD.org>
Subject: Re: ZFS server has gone crazy slow
Message-ID: <2f49fe01-50dc-2e72-8cd6-beb6f0fbb621@FreeBSD.org>
In-Reply-To: <747B75C0-73D7-42B2-9910-9E16FCAE23C4@lysator.liu.se>
References: <2182C27C-A5D3-41BF-9CE9-7C6883E43074@distal.com>
 <20200411174831.GA54397@fuz.su>
 <6190573D-BCA7-44F9-86BD-0DCBB1F69D1D@distal.com>
 <6fd7a561-462e-242d-5057-51c52d716d68@wp.pl>
 <7AA1EA07-6041-464A-A39A-158ACD1DC11C@distal.com>
 <FE84C045-89B1-4772-AF1F-35F78B9877D8@lysator.liu.se>
 <575c01de-b503-f4f9-2f13-f57f428f53ec@FreeBSD.org>
 <747B75C0-73D7-42B2-9910-9E16FCAE23C4@lysator.liu.se>
On 12/04/2020 14:46, Peter Eriksson wrote:
> You are probably right.
>
> However - we have seen (thru experimentation :-) that “zfs destroy -d” for
> recursive snapshot destruction on many filesystems seemed to allow it to be
> done much faster (i.e. the command finished much quicker) on our servers.
> But it also meant that a lot of I/O seemed to be happening quite some time
> after the last “zfs destroy -d” command was issued (and a really long time
> when there were near-quota-full filesystems). No clones or “user holds” in
> use here as far as I know. Why that is I don’t know. With “zfs destroy”
> (no “-d”) things seem to be much more synchronous.
>
> We’ve stopped using “-d” now since we’d rather not have that type of I/O
> load happening during daytime, and we had some issues with some nightly
> snapshot cleanup jobs not finishing in time.

I think that you want to re-test zfs destroy vs destroy -d and do that
rigorously. I am not sure how to explain what you saw, as it cannot be
explained by how destroy -d actually differs from plain destroy. Maybe it
was a cold vs. warmed-up (with respect to the ARC) system, maybe something
else, but certainly not -d if you do not have user holds and clones.
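If you decide to re-test, a minimal A/B comparison along the lines below is
what I have in mind. This is only a sketch, not a tested script - the pool
and dataset names are made up and the sizes are arbitrary - but the idea is
to time the command itself and then watch the pool's 'freeing' property to
see when the space is actually released:

	# two identical throw-away datasets, each with a snapshot holding ~1 GB
	zfs create tank/dtest-a
	zfs create tank/dtest-b
	dd if=/dev/urandom of=/tank/dtest-a/f bs=1m count=1024
	dd if=/dev/urandom of=/tank/dtest-b/f bs=1m count=1024
	zfs snapshot tank/dtest-a@s tank/dtest-b@s
	rm /tank/dtest-a/f /tank/dtest-b/f   # snapshots now solely own the blocks
	sleep 60    # let open TXGs settle so both runs start from the same state

	# plain destroy: time the command, then watch the background freeing
	time zfs destroy tank/dtest-a@s
	zpool get freeing tank

	# deferred destroy: with no holds and no clones it should behave the same
	time zfs destroy -d tank/dtest-b@s
	zpool get freeing tank

With no user holds and no clones I would expect both the command times and
the freeing behaviour to be indistinguishable between the two runs.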
Just in case, here is the only place in the code where 'defer' actually
makes a difference, in dsl_destroy_snapshot_sync_impl():

	if (defer &&
	    (ds->ds_userrefs > 0 ||
	    dsl_dataset_phys(ds)->ds_num_children > 1)) {
		ASSERT(spa_version(dp->dp_spa) >= SPA_VERSION_USERREFS);
		dmu_buf_will_dirty(ds->ds_dbuf, tx);
		dsl_dataset_phys(ds)->ds_flags |= DS_FLAG_DEFER_DESTROY;
		spa_history_log_internal_ds(ds, "defer_destroy", tx, "");
		return;
	}

> Anyway, the “seems to be writing out a lot of queued up ZIL data” at “zfs
> mount -a” time was definitely a real problem - it mounted most of the
> filesystems pretty quickly but then was “extremely slow” for a couple of
> them (and was causing a lot of I/O). Like 4-6 hours. Luckily that one was
> one of our backup servers and during a time when the only one it
> frustrated was me… I’d hate for that to happen on one of the frontend
> (NFS/SMB-serving) servers during office hours :-)

I don't doubt that, I just tried to explain that whatever was in the ZIL
could not have come from zfs destroy. It was something else.

>> On 12 Apr 2020, at 13:26, Andriy Gapon <avg@FreeBSD.org> wrote:
>>
>> On 12/04/2020 00:24, Peter Eriksson wrote:
>>> Another fun thing that might happen: if you reboot your server and
>>> happen to have a lot of queued up writes in the ZIL (for example if you
>>> did a “zfs destroy -d -r POOL@snapshots”, i.e. deferred (background)
>>> destroys of snapshots, and did a hard reboot while it was busy), it will
>>> “write out” those queued transactions at filesystem mount time during
>>> the boot sequence.
>>
>> Just nitpicking on two bits of incorrect information here. First, zfs
>> destroy never uses the ZIL. Never. The ZIL is used only for ZPL
>> operations like file writes, renames, removes, etc. - the things that you
>> can do with POSIX system calls (~ the VFS KPI).
>>
>> Second, zfs destroy -d is not a background destroy. It is a deferred
>> destroy. That means that either the destroy is done immediately, if the
>> snapshot has no holds (no user holds and no clones), or the destroy is
>> postponed until the holds are gone, that is, until the last clone or the
>> last user hold is removed.
>>
>> Note, however, that unless you have a very ancient pool version,
>> destroying a snapshot means that the snapshot object is removed and all
>> blocks belonging to the snapshot are queued for freeing. Their actual
>> freeing is done asynchronously ("in background") and can be spread over
>> multiple TXG periods. That's done regardless of whether -d was used.

-- 
Andriy Gapon