Date:      Mon, 22 Jan 2018 00:19:50 +0000 (UTC)
From:      Alexander Motin <mav@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org
Subject:   svn commit: r328235 - stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Message-ID:  <201801220019.w0M0JoaO020827@repo.freebsd.org>

Author: mav
Date: Mon Jan 22 00:19:50 2018
New Revision: 328235
URL: https://svnweb.freebsd.org/changeset/base/328235

Log:
  MFC r322245: MFV r322242: 8373 TXG_WAIT in ZIL commit path
  
  illumos/illumos-gate@d28671a3b094af696bea87f52272d4c4d89321c7
  https://github.com/illumos/illumos-gate/commit/d28671a3b094af696bea87f52272d4c4d89321c7
  
  https://www.illumos.org/issues/8373
    The code that writes ZIL blocks uses dmu_tx_assign(TXG_WAIT) to assign
    a transaction to a transaction group.  That seems logically incorrect,
    since writing a ZIL block does not introduce any new dirty data.  Also,
    when there is a lot of dirty data, the call can introduce significant
    delays into the ZIL commit path, thus affecting all synchronous
    writes.  Additionally, ARC throttling may affect the ZIL writing.
  
  Reviewed by: Matthew Ahrens <mahrens@delphix.com>
  Reviewed by: Prakash Surya <prakash.surya@delphix.com>
  Approved by: Dan McDonald <danmcd@joyent.com>
  Author: Andriy Gapon <avg@FreeBSD.org>

Modified:
  stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c
Directory Properties:
  stable/11/   (props changed)

Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c
==============================================================================
--- stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c	Mon Jan 22 00:01:36 2018	(r328234)
+++ stable/11/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c	Mon Jan 22 00:19:50 2018	(r328235)
@@ -1227,7 +1227,24 @@ zil_lwb_write_issue(zilog_t *zilog, lwb_t *lwb)
 	 */
 
 	tx = dmu_tx_create(zilog->zl_os);
-	VERIFY(dmu_tx_assign(tx, TXG_WAIT) == 0);
+
+	/*
+	 * Since we are not going to create any new dirty data and we can even
+	 * help with clearing the existing dirty data, we should not be subject
+	 * to the dirty data based delays.
+	 * We (ab)use TXG_WAITED to bypass the delay mechanism.
+	 * One side effect from using TXG_WAITED is that dmu_tx_assign() can
+	 * fail if the pool is suspended.  Those are dramatic circumstances,
+	 * so we return NULL to signal that the normal ZIL processing is not
+	 * possible and txg_wait_synced() should be used to ensure that the data
+	 * is on disk.
+	 */
+	error = dmu_tx_assign(tx, TXG_WAITED);
+	if (error != 0) {
+		ASSERT3S(error, ==, EIO);
+		dmu_tx_abort(tx);
+		return (NULL);
+	}
 	dsl_dataset_dirty(dmu_objset_ds(zilog->zl_os), tx);
 	txg = dmu_tx_get_txg(tx);
 
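The new comment describes a contract: the assign step no longer waits on
dirty-data throttling, but it may fail with EIO when the pool is suspended,
in which case NULL is returned and the data has to be made durable by waiting
for a full sync instead. The following is a minimal, self-contained model of
that control flow, not the actual zil.c code. Every model_* name is
hypothetical, the caller shown is an assumption (the diff does not include
it), and the stand-ins only imitate the documented behaviour of
dmu_tx_assign(tx, TXG_WAITED) and txg_wait_synced().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real ZFS types. */
typedef struct {
	bool suspended;		/* models a suspended pool */
	int sync_waits;		/* counts fallback waits for a full sync */
} model_pool_t;

typedef struct {
	int id;
} model_lwb_t;

/*
 * Models dmu_tx_assign(tx, TXG_WAITED) as described in the comment:
 * it is not delayed by dirty data, but fails with EIO if the pool is
 * suspended.
 */
static int
model_tx_assign_nodelay(const model_pool_t *pool)
{
	return (pool->suspended ? EIO : 0);
}

/* Models txg_wait_synced(): the slow path that waits for a sync. */
static void
model_wait_synced(model_pool_t *pool)
{
	pool->sync_waits++;
}

/*
 * Models the changed part of zil_lwb_write_issue(): returning NULL
 * signals that normal ZIL processing is not possible.
 */
static model_lwb_t *
model_lwb_write_issue(model_pool_t *pool, model_lwb_t *next)
{
	int error = model_tx_assign_nodelay(pool);

	if (error != 0) {
		assert(error == EIO);
		/* The real code calls dmu_tx_abort(tx) at this point. */
		return (NULL);
	}
	return (next);
}

/*
 * Hypothetical caller: returns true if the log block was issued, or
 * false if it had to fall back to waiting for a full sync.
 */
static bool
model_commit(model_pool_t *pool, model_lwb_t *next)
{
	if (model_lwb_write_issue(pool, next) == NULL) {
		model_wait_synced(pool);
		return (false);
	}
	return (true);
}
```

In this model, calling model_commit() on a pool with suspended == false
issues the block without any fallback, while a pool with suspended == true
takes the fallback path exactly once per call.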


