Date: Wed, 16 Jan 2013 22:59:50 +0000 (UTC) From: Xin LI <delphij@FreeBSD.org> To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org Subject: svn commit: r245511 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs Message-ID: <201301162259.r0GMxocD092929@svn.freebsd.org>
next in thread | raw e-mail | index | archive | help
Author: delphij Date: Wed Jan 16 22:59:50 2013 New Revision: 245511 URL: http://svnweb.freebsd.org/changeset/base/245511 Log: MFV r245510: improve the comment in txg.c Obtained from: Illumos (13910:f3454e0a097c) MFC after: 2 weeks Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Directory Properties: head/sys/cddl/contrib/opensolaris/ (props changed) Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c ============================================================================== --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Wed Jan 16 22:50:40 2013 (r245510) +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Wed Jan 16 22:59:50 2013 (r245511) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Portions Copyright 2011 Martin Matuska <mm@FreeBSD.org> - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Delphix. All rights reserved. */ #include <sys/zfs_context.h> @@ -33,7 +33,76 @@ #include <sys/callb.h> /* - * Pool-wide transaction groups. + * ZFS Transaction Groups + * ---------------------- + * + * ZFS transaction groups are, as the name implies, groups of transactions + * that act on persistent state. ZFS asserts consistency at the granularity of + * these transaction groups. Each successive transaction group (txg) is + * assigned a 64-bit consecutive identifier. There are three active + * transaction group states: open, quiescing, or syncing. At any given time, + * there may be an active txg associated with each state; each active txg may + * either be processing, or blocked waiting to enter the next state. There may + * be up to three active txgs, and there is always a txg in the open state + * (though it may be blocked waiting to enter the quiescing state). In broad + * strokes, transactions — operations that change in-memory structures — are + * accepted into the txg in the open state, and are completed while the txg is + * in the open or quiescing states. The accumulated changes are written to + * disk in the syncing state. + * + * Open + * + * When a new txg becomes active, it first enters the open state. New + * transactions — updates to in-memory structures — are assigned to the + * currently open txg. There is always a txg in the open state so that ZFS can + * accept new changes (though the txg may refuse new changes if it has hit + * some limit). ZFS advances the open txg to the next state for a variety of + * reasons such as it hitting a time or size threshold, or the execution of an + * administrative action that must be completed in the syncing state. + * + * Quiescing + * + * After a txg exits the open state, it enters the quiescing state. The + * quiescing state is intended to provide a buffer between accepting new + * transactions in the open state and writing them out to stable storage in + * the syncing state. While quiescing, transactions can continue their + * operation without delaying either of the other states. Typically, a txg is + * in the quiescing state very briefly since the operations are bounded by + * software latencies rather than, say, slower I/O latencies. After all + * transactions complete, the txg is ready to enter the next state. + * + * Syncing + * + * In the syncing state, the in-memory state built up during the open and (to + * a lesser degree) the quiescing states is written to stable storage. The + * process of writing out modified data can, in turn modify more data. For + * example when we write new blocks, we need to allocate space for them; those + * allocations modify metadata (space maps)... which themselves must be + * written to stable storage. During the sync state, ZFS iterates, writing out + * data until it converges and all in-memory changes have been written out. + * The first such pass is the largest as it encompasses all the modified user + * data (as opposed to filesystem metadata). Subsequent passes typically have + * far less data to write as they consist exclusively of filesystem metadata. + * + * To ensure convergence, after a certain number of passes ZFS begins + * overwriting locations on stable storage that had been allocated earlier in + * the syncing state (and subsequently freed). ZFS usually allocates new + * blocks to optimize for large, continuous, writes. For the syncing state to + * converge however it must complete a pass where no new blocks are allocated + * since each allocation requires a modification of persistent metadata. + * Further, to hasten convergence, after a prescribed number of passes, ZFS + * also defers frees, and stops compressing. + * + * In addition to writing out user data, we must also execute synctasks during + * the syncing context. A synctask is the mechanism by which some + * administrative activities work such as creating and destroying snapshots or + * datasets. Note that when a synctask is initiated it enters the open txg, + * and ZFS then pushes that txg as quickly as possible to completion of the + * syncing state in order to reduce the latency of the administrative + * activity. To complete the syncing state, ZFS writes out a new uberblock, + * the root of the tree of blocks that comprise all state stored on the ZFS + * pool. Finally, if there is a quiesced txg waiting, we signal that it can + * now transition to the syncing state. */ static void txg_sync_thread(void *arg);
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201301162259.r0GMxocD092929>