Date: Tue, 8 Aug 2017 11:26:03 +0000 (UTC)
From: Andriy Gapon <avg@FreeBSD.org>
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r322245 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Message-ID: <201708081126.v78BQ3Lr047571@repo.freebsd.org>
Author: avg
Date: Tue Aug  8 11:26:03 2017
New Revision: 322245
URL: https://svnweb.freebsd.org/changeset/base/322245

Log:
  MFV r322242: 8373 TXG_WAIT in ZIL commit path

  illumos/illumos-gate@d28671a3b094af696bea87f52272d4c4d89321c7
  https://github.com/illumos/illumos-gate/commit/d28671a3b094af696bea87f52272d4c4d89321c7

  https://www.illumos.org/issues/8373
    The code that writes ZIL blocks uses dmu_tx_assign(TXG_WAIT) to assign
    a transaction to a transaction group.  That seems to be logically
    incorrect as writing of the ZIL block does not introduce any new dirty
    data.  Also, when there is a lot of dirty data, the call can introduce
    significant delays into the ZIL commit path, thus affecting all
    synchronous writes.  Additionally, ARC throttling may affect the ZIL
    writing.

  Reviewed by: Matthew Ahrens <mahrens@delphix.com>
  Reviewed by: Prakash Surya <prakash.surya@delphix.com>
  Approved by: Dan McDonald <danmcd@joyent.com>
  Author: Andriy Gapon <avg@FreeBSD.org>

  MFC after:	2 weeks

Modified:
  head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c
Directory Properties:
  head/sys/cddl/contrib/opensolaris/   (props changed)

Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c
==============================================================================
--- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c	Tue Aug  8 11:25:09 2017	(r322244)
+++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c	Tue Aug  8 11:26:03 2017	(r322245)
@@ -985,7 +985,24 @@ zil_lwb_write_start(zilog_t *zilog, lwb_t *lwb, boolea
 	 * to clean up in the event of allocation failure or I/O failure.
 	 */
 	tx = dmu_tx_create(zilog->zl_os);
-	VERIFY(dmu_tx_assign(tx, TXG_WAIT) == 0);
+
+	/*
+	 * Since we are not going to create any new dirty data and we can even
+	 * help with clearing the existing dirty data, we should not be subject
+	 * to the dirty data based delays.
+	 * We (ab)use TXG_WAITED to bypass the delay mechanism.
+	 * One side effect from using TXG_WAITED is that dmu_tx_assign() can
+	 * fail if the pool is suspended.  Those are dramatic circumstances,
+	 * so we return NULL to signal that the normal ZIL processing is not
+	 * possible and txg_wait_synced() should be used to ensure that the data
+	 * is on disk.
+	 */
+	error = dmu_tx_assign(tx, TXG_WAITED);
+	if (error != 0) {
+		ASSERT3S(error, ==, EIO);
+		dmu_tx_abort(tx);
+		return (NULL);
+	}
 	dsl_dataset_dirty(dmu_objset_ds(zilog->zl_os), tx);
 	txg = dmu_tx_get_txg(tx);
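
The new comment says that a NULL return signals that normal ZIL processing
is not possible and that txg_wait_synced() should be used instead.  The
caller side of that contract is not part of this hunk, so the following is
only a minimal, self-contained sketch of the pattern.  Every mock_* name and
type below is hypothetical and stands in for the real ZFS structures; none of
it is the actual zil.c code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the real ZFS types. */
typedef struct mock_pool {
	bool	suspended;	/* models a suspended pool */
	int	synced_calls;	/* counts fallback full-sync waits */
} mock_pool_t;

typedef struct mock_lwb {
	int	id;
} mock_lwb_t;

/*
 * Models an assign that never applies the dirty-data delay, but that can
 * fail with EIO when the pool is suspended (the TXG_WAITED side effect
 * described in the commit).
 */
static int
mock_tx_assign_nodelay(const mock_pool_t *pool)
{
	assert(pool != NULL);
	return (pool->suspended ? EIO : 0);
}

/* Models the new contract: NULL means "normal ZIL processing not possible". */
static mock_lwb_t *
mock_lwb_write_start(mock_pool_t *pool, mock_lwb_t *lwb)
{
	if (mock_tx_assign_nodelay(pool) != 0)
		return (NULL);
	return (lwb);
}

/* Stand-in for waiting until the whole transaction group is on disk. */
static void
mock_wait_synced(mock_pool_t *pool)
{
	pool->synced_calls++;
}

/*
 * Hypothetical caller: returns true if the log block was issued normally,
 * false if it had to fall back to waiting for a full sync.
 */
static bool
mock_commit(mock_pool_t *pool, mock_lwb_t *lwb)
{
	if (mock_lwb_write_start(pool, lwb) == NULL) {
		mock_wait_synced(pool);
		return (false);
	}
	return (true);
}
```

In this sketch a healthy pool takes the normal path and never touches the
fallback, while a suspended pool takes the fallback exactly once per commit.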