Date: Wed, 14 Aug 2013 23:34:16 +0000 (UTC)
From: Warren Block <wblock@FreeBSD.org>
To: doc-committers@freebsd.org, svn-doc-projects@freebsd.org
Subject: svn commit: r42542 - projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs
Message-ID: <201308142334.r7ENYGR9021849@svn.freebsd.org>
Author: wblock
Date: Wed Aug 14 23:34:16 2013
New Revision: 42542
URL: http://svnweb.freebsd.org/changeset/doc/42542

Log:
  Whitespace-only fixes. Translators, please ignore.

Modified:
  projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml

Modified: projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml
==============================================================================
--- projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Wed Aug 14 22:29:07 2013 (r42541)
+++ projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Wed Aug 14 23:34:16 2013 (r42542)
@@ -15,723 +15,729 @@
</authorgroup> </chapterinfo> - <title>The Z File System (ZFS)</title> + <title>The Z File System (ZFS)</title> - <para>The Z file system, originally developed by &sun;, - is designed to future proof the file system by removing many of - the arbitrary limits imposed on previous file systems. ZFS - allows continuous growth of the pooled storage by adding - additional devices. ZFS allows you to create many file systems - (in addition to block devices) out of a single shared pool of - storage. Space is allocated as needed, so all remaining free - space is available to each file system in the pool. It is also - designed for maximum data integrity, supporting data snapshots, - multiple copies, and cryptographic checksums. It uses a - software data replication model, known as - <acronym>RAID</acronym>-Z. <acronym>RAID</acronym>-Z provides - redundancy similar to hardware <acronym>RAID</acronym>, but is - designed to prevent data write corruption and to overcome some - of the limitations of hardware <acronym>RAID</acronym>.</para> - - <sect1 id="filesystems-zfs-term"> - <title>ZFS Features and Terminology</title> - - <para>ZFS is a fundamentally different file system because it - is more than just a file system. 
ZFS combines the roles of - file system and volume manager, enabling additional storage - devices to be added to a live system and having the new space - available on all of the existing file systems in that pool - immediately. By combining the traditionally separate roles, - ZFS is able to overcome previous limitations that prevented - RAID groups being able to grow. Each top level device in a - zpool is called a vdev, which can be a simple disk or a RAID - transformation such as a mirror or RAID-Z array. ZFS file - systems (called datasets), each have access to the combined - free space of the entire pool. As blocks are allocated the - free space in the pool available to of each file system is - decreased. This approach avoids the common pitfall with - extensive partitioning where free space becomes fragmentated - across the partitions.</para> - - <informaltable pgwide="1"> - <tgroup cols="2"> - <tbody> - <row> - <entry valign="top" - id="filesystems-zfs-term-zpool">zpool</entry> - - <entry>A storage pool is the most basic building block - of ZFS. A pool is made up of one or more vdevs, the - underlying devices that store the data. A pool is - then used to create one or more file systems - (datasets) or block devices (volumes). These datasets - and volumes share the pool of remaining free space. - Each pool is uniquely identified by a name and a - <acronym>GUID</acronym>. The zpool also controls the - version number and therefore the features available - for use with ZFS. - <note><para>&os; 9.0 and 9.1 include - support for ZFS version 28. Future versions use ZFS - version 5000 with feature flags. This allows - greater cross-compatibility with other - implementations of ZFS. - </para></note></entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-vdev">vdev Types</entry> - - <entry>A zpool is made up of one or more vdevs, which - themselves can be a single disk or a group of disks, - in the case of a RAID transform. 
When multiple vdevs - are used, ZFS spreads data across the vdevs to - increase performance and maximize usable space. - <itemizedlist> - <listitem> - <para id="filesystems-zfs-term-vdev-disk"> - <emphasis>Disk</emphasis> - The most basic type - of vdev is a standard block device. This can be - an entire disk (such as - <devicename><replaceable>/dev/ada0</replaceable></devicename> - or - <devicename><replaceable>/dev/da0</replaceable></devicename>) - or a partition - (<devicename><replaceable>/dev/ada0p3</replaceable></devicename>). - Contrary to the Solaris documentation, on &os; - there is no performance penalty for using a - partition rather than an entire disk.</para> - </listitem> - - <listitem> - <para id="filesystems-zfs-term-vdev-file"> - <emphasis>File</emphasis> - In addition to - disks, ZFS pools can be backed by regular files, - this is especially useful for testing and - experimentation. Use the full path to the file - as the device path in the zpool create command. - All vdevs must be atleast 128 MB in - size.</para> - </listitem> - - <listitem> - <para id="filesystems-zfs-term-vdev-mirror"> - <emphasis>Mirror</emphasis> - When creating a - mirror, specify the <literal>mirror</literal> - keyword followed by the list of member devices - for the mirror. A mirror consists of two or - more devices, all data will be written to all - member devices. A mirror vdev will only hold as - much data as its smallest member. A mirror vdev - can withstand the failure of all but one of its - members without losing any data.</para> - - <note> - <para> - A regular single disk vdev can be - upgraded to a mirror vdev at any time using - the <command>zpool</command> <link + <para>The Z file system, originally developed by &sun;, + is designed to future proof the file system by removing many of + the arbitrary limits imposed on previous file systems. ZFS + allows continuous growth of the pooled storage by adding + additional devices. 
ZFS allows you to create many file systems + (in addition to block devices) out of a single shared pool of + storage. Space is allocated as needed, so all remaining free + space is available to each file system in the pool. It is also + designed for maximum data integrity, supporting data snapshots, + multiple copies, and cryptographic checksums. It uses a + software data replication model, known as + <acronym>RAID</acronym>-Z. <acronym>RAID</acronym>-Z provides + redundancy similar to hardware <acronym>RAID</acronym>, but is + designed to prevent data write corruption and to overcome some + of the limitations of hardware <acronym>RAID</acronym>.</para> + + <sect1 id="filesystems-zfs-term"> + <title>ZFS Features and Terminology</title> + + <para>ZFS is a fundamentally different file system because it + is more than just a file system. ZFS combines the roles of + file system and volume manager, enabling additional storage + devices to be added to a live system and having the new space + available on all of the existing file systems in that pool + immediately. By combining the traditionally separate roles, + ZFS is able to overcome previous limitations that prevented + RAID groups being able to grow. Each top level device in a + zpool is called a vdev, which can be a simple disk or a RAID + transformation such as a mirror or RAID-Z array. ZFS file + systems (called datasets), each have access to the combined + free space of the entire pool. As blocks are allocated the + free space in the pool available to of each file system is + decreased. This approach avoids the common pitfall with + extensive partitioning where free space becomes fragmentated + across the partitions.</para> + + <informaltable pgwide="1"> + <tgroup cols="2"> + <tbody> + <row> + <entry valign="top" + id="filesystems-zfs-term-zpool">zpool</entry> + + <entry>A storage pool is the most basic building block of + ZFS. A pool is made up of one or more vdevs, the + underlying devices that store the data. 
A pool is then + used to create one or more file systems (datasets) or + block devices (volumes). These datasets and volumes + share the pool of remaining free space. Each pool is + uniquely identified by a name and a + <acronym>GUID</acronym>. The zpool also controls the + version number and therefore the features available for + use with ZFS. + + <note> + <para>&os; 9.0 and 9.1 include support for ZFS version + 28. Future versions use ZFS version 5000 with + feature flags. This allows greater + cross-compatibility with other implementations of + ZFS.</para> + </note></entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-vdev">vdev Types</entry> + + <entry>A zpool is made up of one or more vdevs, which + themselves can be a single disk or a group of disks, in + the case of a RAID transform. When multiple vdevs are + used, ZFS spreads data across the vdevs to increase + performance and maximize usable space. + + <itemizedlist> + <listitem> + <para id="filesystems-zfs-term-vdev-disk"> + <emphasis>Disk</emphasis> - The most basic type + of vdev is a standard block device. This can be + an entire disk (such as + <devicename><replaceable>/dev/ada0</replaceable></devicename> + or + <devicename><replaceable>/dev/da0</replaceable></devicename>) + or a partition + (<devicename><replaceable>/dev/ada0p3</replaceable></devicename>). + Contrary to the Solaris documentation, on &os; + there is no performance penalty for using a + partition rather than an entire disk.</para> + </listitem> + + <listitem> + <para id="filesystems-zfs-term-vdev-file"> + <emphasis>File</emphasis> - In addition to + disks, ZFS pools can be backed by regular files, + this is especially useful for testing and + experimentation. Use the full path to the file + as the device path in the zpool create command. 
+ All vdevs must be atleast 128 MB in + size.</para> + </listitem> + + <listitem> + <para id="filesystems-zfs-term-vdev-mirror"> + <emphasis>Mirror</emphasis> - When creating a + mirror, specify the <literal>mirror</literal> + keyword followed by the list of member devices + for the mirror. A mirror consists of two or + more devices, all data will be written to all + member devices. A mirror vdev will only hold as + much data as its smallest member. A mirror vdev + can withstand the failure of all but one of its + members without losing any data.</para> + + <note> + <para>regular single disk vdev can be upgraded to + a mirror vdev at any time using the + <command>zpool</command> <link linkend="filesystems-zfs-zpool-attach">attach</link> - command.</para> - </note> - </listitem> - - <listitem> - <para id="filesystems-zfs-term-vdev-raidz"> - <emphasis><acronym>RAID</acronym>-Z</emphasis> - - ZFS implements RAID-Z, a variation on standard - RAID-5 that offers better distribution of parity - and eliminates the "RAID-5 write hole" in which - the data and parity information become - inconsistent after an unexpected restart. ZFS - supports 3 levels of RAID-Z which provide - varying levels of redundancy in exchange for - decreasing levels of usable storage. The types - are named RAID-Z1 through Z3 based on the number - of parity devinces in the array and the number - of disks that the pool can operate - without.</para> - - <para>In a RAID-Z1 configuration with 4 disks, - each 1 TB, usable storage will be 3 TB - and the pool will still be able to operate in - degraded mode with one faulted disk. If an - additional disk goes offline before the faulted - disk is replaced and resilvered, all data in the - pool can be lost.</para> - - <para>In a RAID-Z3 configuration with 8 disks of - 1 TB, the volume would provide 5TB of - usable space and still be able to operate with - three faulted disks. Sun recommends no more - than 9 disks in a single vdev. 
If the - configuration has more disks, it is recommended - to divide them into separate vdevs and the pool - data will be striped across them.</para> - - <para>A configuration of 2 RAID-Z2 vdevs - consisting of 8 disks each would create - something similar to a RAID 60 array. A RAID-Z - group's storage capacity is approximately the - size of the smallest disk, multiplied by the - number of non-parity disks. 4x 1 TB disks - in Z1 has an effective size of approximately - 3 TB, and a 8x 1 TB array in Z3 will - yeild 5 TB of usable space.</para> - </listitem> - - <listitem> - <para id="filesystems-zfs-term-vdev-spare"> - <emphasis>Spare</emphasis> - ZFS has a special - pseudo-vdev type for keeping track of available - hot spares. Note that installed hot spares are - not deployed automatically; they must manually - be configured to replace the failed device using - the zfs replace command.</para> - </listitem> - - <listitem> - <para id="filesystems-zfs-term-vdev-log"> - <emphasis>Log</emphasis> - ZFS Log Devices, also - known as ZFS Intent Log (<acronym>ZIL</acronym>) - move the intent log from the regular pool - devices to a dedicated device. The ZIL - accelerates synchronous transactions by using - storage devices (such as - <acronym>SSD</acronym>s) that are faster - compared to those used for the main pool. When - data is being written and the application - requests a guarantee that the data has been - safely stored, the data is written to the faster - ZIL storage, then later flushed out to the - regular disks, greatly reducing the latency of - synchronous writes. Log devices can be - mirrored, but RAID-Z is not supported. When - specifying multiple log devices writes will be - load balanced across all devices.</para> - </listitem> - - <listitem> - <para id="filesystems-zfs-term-vdev-cache"> - <emphasis>Cache</emphasis> - Adding a cache vdev - to a zpool will add the storage of the cache to - the L2ARC. Cache devices cannot be mirrored. 
- Since a cache device only stores additional - copies of existing data, there is no risk of - data loss.</para> - </listitem> - </itemizedlist></entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-arc">Adaptive Replacement - Cache (<acronym>ARC</acronym>)</entry> - - <entry>ZFS uses an Adaptive Replacement Cache - (<acronym>ARC</acronym>), rather than a more - traditional Least Recently Used - (<acronym>LRU</acronym>) cache. An - <acronym>LRU</acronym> cache is a simple list of items - in the cache sorted by when each object was most - recently used; new items are added to the top of the - list and once the cache is full items from the bottom - of the list are evicted to make room for more active - objects. An <acronym>ARC</acronym> consists of four - lists; the Most Recently Used (<acronym>MRU</acronym>) - and Most Frequently Used (<acronym>MFU</acronym>) - objects, plus a ghost list for each. These ghost - lists tracks recently evicted objects to provent them - being added back to the cache. This increases the - cache hit ratio by avoiding objects that have a - history of only being used occasionally. Another - advantage of using both an <acronym>MRU</acronym> and - <acronym>MFU</acronym> is that scanning an entire - filesystem would normally evict all data from an - <acronym>MRU</acronym> or <acronym>LRU</acronym> cache - in favor of this freshly accessed content. In the - case of <acronym>ZFS</acronym> since there is also an - <acronym>MFU</acronym> that only tracks the most - frequently used objects, the cache of the most - commonly accessed blocks remains.</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-l2arc">L2ARC</entry> - - <entry>The <acronym>L2ARC</acronym> is the second level - of the <acronym>ZFS</acronym> caching system. 
The - primary <acronym>ARC</acronym> is stored in - <acronym>RAM</acronym>, however since the amount of - available <acronym>RAM</acronym> is often limited, - <acronym>ZFS</acronym> can also make use of <link + command.</para> + </note> + </listitem> + + <listitem> + <para id="filesystems-zfs-term-vdev-raidz"> + <emphasis><acronym>RAID</acronym>-Z</emphasis> - + ZFS implements RAID-Z, a variation on standard + RAID-5 that offers better distribution of parity + and eliminates the "RAID-5 write hole" in which + the data and parity information become + inconsistent after an unexpected restart. ZFS + supports 3 levels of RAID-Z which provide + varying levels of redundancy in exchange for + decreasing levels of usable storage. The types + are named RAID-Z1 through Z3 based on the number + of parity devinces in the array and the number + of disks that the pool can operate + without.</para> + + <para>In a RAID-Z1 configuration with 4 disks, + each 1 TB, usable storage will be 3 TB + and the pool will still be able to operate in + degraded mode with one faulted disk. If an + additional disk goes offline before the faulted + disk is replaced and resilvered, all data in the + pool can be lost.</para> + + <para>In a RAID-Z3 configuration with 8 disks of + 1 TB, the volume would provide 5TB of + usable space and still be able to operate with + three faulted disks. Sun recommends no more + than 9 disks in a single vdev. If the + configuration has more disks, it is recommended + to divide them into separate vdevs and the pool + data will be striped across them.</para> + + <para>A configuration of 2 RAID-Z2 vdevs + consisting of 8 disks each would create + something similar to a RAID 60 array. A RAID-Z + group's storage capacity is approximately the + size of the smallest disk, multiplied by the + number of non-parity disks. 
4x 1 TB disks + in Z1 has an effective size of approximately + 3 TB, and a 8x 1 TB array in Z3 will + yeild 5 TB of usable space.</para> + </listitem> + + <listitem> + <para id="filesystems-zfs-term-vdev-spare"> + <emphasis>Spare</emphasis> - ZFS has a special + pseudo-vdev type for keeping track of available + hot spares. Note that installed hot spares are + not deployed automatically; they must manually + be configured to replace the failed device using + the zfs replace command.</para> + </listitem> + + <listitem> + <para id="filesystems-zfs-term-vdev-log"> + <emphasis>Log</emphasis> - ZFS Log Devices, also + known as ZFS Intent Log (<acronym>ZIL</acronym>) + move the intent log from the regular pool + devices to a dedicated device. The ZIL + accelerates synchronous transactions by using + storage devices (such as + <acronym>SSD</acronym>s) that are faster + compared to those used for the main pool. When + data is being written and the application + requests a guarantee that the data has been + safely stored, the data is written to the faster + ZIL storage, then later flushed out to the + regular disks, greatly reducing the latency of + synchronous writes. Log devices can be + mirrored, but RAID-Z is not supported. When + specifying multiple log devices writes will be + load balanced across all devices.</para> + </listitem> + + <listitem> + <para id="filesystems-zfs-term-vdev-cache"> + <emphasis>Cache</emphasis> - Adding a cache vdev + to a zpool will add the storage of the cache to + the L2ARC. Cache devices cannot be mirrored. 
+ Since a cache device only stores additional + copies of existing data, there is no risk of + data loss.</para> + </listitem> + </itemizedlist></entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-arc">Adaptive Replacement + Cache (<acronym>ARC</acronym>)</entry> + + <entry>ZFS uses an Adaptive Replacement Cache + (<acronym>ARC</acronym>), rather than a more + traditional Least Recently Used + (<acronym>LRU</acronym>) cache. An + <acronym>LRU</acronym> cache is a simple list of items + in the cache sorted by when each object was most + recently used; new items are added to the top of the + list and once the cache is full items from the bottom + of the list are evicted to make room for more active + objects. An <acronym>ARC</acronym> consists of four + lists; the Most Recently Used (<acronym>MRU</acronym>) + and Most Frequently Used (<acronym>MFU</acronym>) + objects, plus a ghost list for each. These ghost + lists tracks recently evicted objects to provent them + being added back to the cache. This increases the + cache hit ratio by avoiding objects that have a + history of only being used occasionally. Another + advantage of using both an <acronym>MRU</acronym> and + <acronym>MFU</acronym> is that scanning an entire + filesystem would normally evict all data from an + <acronym>MRU</acronym> or <acronym>LRU</acronym> cache + in favor of this freshly accessed content. In the + case of <acronym>ZFS</acronym> since there is also an + <acronym>MFU</acronym> that only tracks the most + frequently used objects, the cache of the most + commonly accessed blocks remains.</entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-l2arc">L2ARC</entry> + + <entry>The <acronym>L2ARC</acronym> is the second level + of the <acronym>ZFS</acronym> caching system. 
The + primary <acronym>ARC</acronym> is stored in + <acronym>RAM</acronym>, however since the amount of + available <acronym>RAM</acronym> is often limited, + <acronym>ZFS</acronym> can also make use of <link linkend="filesystems-zfs-term-vdev-cache">cache</link> - vdevs. Solid State Disks (<acronym>SSD</acronym>s) - are often used as these cache devices due to their - higher speed and lower latency compared to traditional - spinning disks. An L2ARC is entirely optional, but - having one will significantly increase read speeds for - files that are cached on the <acronym>SSD</acronym> - instead of having to be read from the regular spinning - disks. The L2ARC can also speed up <link + vdevs. Solid State Disks (<acronym>SSD</acronym>s) are + often used as these cache devices due to their higher + speed and lower latency compared to traditional spinning + disks. An L2ARC is entirely optional, but having one + will significantly increase read speeds for files that + are cached on the <acronym>SSD</acronym> instead of + having to be read from the regular spinning disks. The + L2ARC can also speed up <link linkend="filesystems-zfs-term-deduplication">deduplication</link> - since a <acronym>DDT</acronym> that does not fit in - <acronym>RAM</acronym> but does fit in the - <acronym>L2ARC</acronym> will be much faster than if - the <acronym>DDT</acronym> had to be read from disk. - The rate at which data is added to the cache devices - is limited to prevent prematurely wearing out the - <acronym>SSD</acronym> with too many writes. Until - the cache is full (the first block has been evicted to - make room), writing to the <acronym>L2ARC</acronym> is - limited to the sum of the write limit and the boost - limit, then after that limited to the write limit. 
A - pair of sysctl values control these rate limits; - <literal>vfs.zfs.l2arc_write_max</literal> controls - how many bytes are written to the cache per second, - while <literal>vfs.zfs.l2arc_write_boost</literal> - adds to this limit during the "Turbo Warmup Phase" - (Write Boost).</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-cow">Copy-On-Write</entry> - - <entry>Unlike a traditional file system, when data is - overwritten on ZFS the new data is written to a - different block rather than overwriting the old data - in place. Only once this write is complete is the - metadata then updated to point to the new location of - the data. This means that in the event of a shorn - write (a system crash or power loss in the middle of - writing a file) the entire original contents of the - file are still available and the incomplete write is - discarded. This also means that ZFS does not require - a fsck after an unexpected shutdown.</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-dataset">Dataset</entry> - - <entry>Dataset is the generic term for a ZFS file - system, volume, snapshot or clone. Each dataset will - have a unique name in the format: - <literal>poolname/path@snapshot</literal>. The root - of the pool is technically a dataset as well. Child - datasets are named hierarchically like directories; - for example <literal>mypool/home</literal>, the home - dataset is a child of mypool and inherits properties - from it. This can be expended further by creating - <literal>mypool/home/user</literal>. This grandchild - dataset will inherity properties from the parent and - grandparent. It is also possible to set properties - on a child to override the defaults inherited from the - parents and grandparents. 
ZFS also allows - administration of datasets and their children to be - delegated.</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-volum">Volume</entry> - - <entry>In additional to regular file system datasets, - ZFS can also create volumes, which are block devices. - Volumes have many of the same features, including - copy-on-write, snapshots, clones and - checksumming. Volumes can be useful for running other - file system formats on top of ZFS, such as UFS or in - the case of Virtualization or exporting - <acronym>iSCSI</acronym> extents.</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-snapshot">Snapshot</entry> - - <entry>The <link - linkend="filesystems-zfs-term-cow">copy-on-write</link> - design of ZFS allows for nearly instantaneous - consistent snapshots with arbitrary names. After - taking a snapshot of a dataset (or a recursive - snapshot of a parent dataset that will include all - child datasets), new data is written to new blocks (as - described above), however the old blocks are not - reclaimed as free space. There are then two versions - of the file system, the snapshot (what the file system - looked like before) and the live file system; however - no additional space is used. As new data is written - to the live file system, new blocks are allocated to - store this data. The apparent size of the snapshot - will grow as the blocks are no longer used in the live - file system, but only in the snapshot. These - snapshots can be mounted (read only) to allow for the - recovery of previous versions of files. It is also - possible to <link + since a <acronym>DDT</acronym> that does not fit in + <acronym>RAM</acronym> but does fit in the + <acronym>L2ARC</acronym> will be much faster than if the + <acronym>DDT</acronym> had to be read from disk. The + rate at which data is added to the cache devices is + limited to prevent prematurely wearing out the + <acronym>SSD</acronym> with too many writes. 
Until the + cache is full (the first block has been evicted to make + room), writing to the <acronym>L2ARC</acronym> is + limited to the sum of the write limit and the boost + limit, then after that limited to the write limit. A + pair of sysctl values control these rate limits; + <literal>vfs.zfs.l2arc_write_max</literal> controls how + many bytes are written to the cache per second, while + <literal>vfs.zfs.l2arc_write_boost</literal> adds to + this limit during the "Turbo Warmup Phase" (Write + Boost).</entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-cow">Copy-On-Write</entry> + + <entry>Unlike a traditional file system, when data is + overwritten on ZFS the new data is written to a + different block rather than overwriting the old data in + place. Only once this write is complete is the metadata + then updated to point to the new location of the data. + This means that in the event of a shorn write (a system + crash or power loss in the middle of writing a file) the + entire original contents of the file are still available + and the incomplete write is discarded. This also means + that ZFS does not require a fsck after an unexpected + shutdown.</entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-dataset">Dataset</entry> + + <entry>Dataset is the generic term for a ZFS file system, + volume, snapshot or clone. Each dataset will have a + unique name in the format: + <literal>poolname/path@snapshot</literal>. The root of + the pool is technically a dataset as well. Child + datasets are named hierarchically like directories; for + example <literal>mypool/home</literal>, the home dataset + is a child of mypool and inherits properties from it. + This can be expended further by creating + <literal>mypool/home/user</literal>. This grandchild + dataset will inherity properties from the parent and + grandparent. 
It is also possible to set properties + on a child to override the defaults inherited from the + parents and grandparents. ZFS also allows + administration of datasets and their children to be + delegated.</entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-volum">Volume</entry> + + <entry>In additional to regular file system datasets, ZFS + can also create volumes, which are block devices. + Volumes have many of the same features, including + copy-on-write, snapshots, clones and checksumming. + Volumes can be useful for running other file system + formats on top of ZFS, such as UFS or in the case of + Virtualization or exporting <acronym>iSCSI</acronym> + extents.</entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-snapshot">Snapshot</entry> + + <entry>The <link + linkend="filesystems-zfs-term-cow">copy-on-write</link> + + design of ZFS allows for nearly instantaneous consistent + snapshots with arbitrary names. After taking a snapshot + of a dataset (or a recursive snapshot of a parent + dataset that will include all child datasets), new data + is written to new blocks (as described above), however + the old blocks are not reclaimed as free space. There + are then two versions of the file system, the snapshot + (what the file system looked like before) and the live + file system; however no additional space is used. As + new data is written to the live file system, new blocks + are allocated to store this data. The apparent size of + the snapshot will grow as the blocks are no longer used + in the live file system, but only in the snapshot. + These snapshots can be mounted (read only) to allow for + the recovery of previous versions of files. It is also + possible to <link linkend="filesystems-zfs-zfs-snapshot">rollback</link> - a live file system to a specific snapshot, undoing any - changes that took place after the snapshot was taken. 
- Each block in the zpool has a reference counter which - indicates how many snapshots, clones, datasets or - volumes make use of that block. As files and - snapshots are deleted, the reference count is - decremented; once a block is no longer referenced, it - is reclaimed as free space. Snapshots can also be - marked with a <link + a live file system to a specific snapshot, undoing any + changes that took place after the snapshot was taken. + Each block in the zpool has a reference counter which + indicates how many snapshots, clones, datasets or + volumes make use of that block. As files and snapshots + are deleted, the reference count is decremented; once a + block is no longer referenced, it is reclaimed as free + space. Snapshots can also be marked with a <link linkend="filesystems-zfs-zfs-snapshot">hold</link>, - once a snapshot is held, any attempt to destroy it - will return an EBUY error. Each snapshot can have - multiple holds, each with a unique name. The <link + once a snapshot is held, any attempt to destroy it will + return an EBUY error. Each snapshot can have multiple + holds, each with a unique name. The <link linkend="filesystems-zfs-zfs-snapshot">release</link> - command removes the hold so the snapshot can then be - deleted. Snapshots can be taken on volumes, however - they can only be cloned or rolled back, not mounted - independently.</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-clone">Clone</entry> - - <entry>Snapshots can also be cloned; a clone is a - writable version of a snapshot, allowing the file - system to be forked as a new dataset. As with a - snapshot, a clone initially consumes no additional - space, only as new data is written to a clone and new - blocks are allocated does the apparent size of the - clone grow. As blocks are overwritten in the cloned - file system or volume, the reference count on the - previous block is decremented. 
The snapshot upon - which a clone is based cannot be deleted because the - clone is dependeant upon it (the snapshot is the - parent, and the clone is the child). Clones can be - <literal>promoted</literal>, reversing this - dependeancy, making the clone the parent and the - previous parent the child. This operation requires no - additional space, however it will change the way the - used space is accounted.</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-checksum">Checksum</entry> - - <entry>Every block that is allocated is also checksummed - (which algorithm is used is a per dataset property, - see: zfs set). ZFS transparently validates the - checksum of each block as it is read, allowing ZFS to - detect silent corruption. If the data that is read - does not match the expected checksum, ZFS will attempt - to recover the data from any available redundancy - (mirrors, RAID-Z). You can trigger the validation of - all checksums using the <link - linkend="filesystems-zfs-term-scrub">scrub</link> - command. The available checksum algorithms include: - <itemizedlist> - <listitem><para>fletcher2</para></listitem> - <listitem><para>fletcher4</para></listitem> - <listitem><para>sha256</para></listitem> - </itemizedlist> The fletcher algorithms are faster, - but sha256 is a strong cryptographic hash and has a - much lower chance of a collisions at the cost of some - performance. Checksums can be disabled but it is - inadvisable.</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-compression">Compression</entry> - - <entry>Each dataset in ZFS has a compression property, - which defaults to off. This property can be set to - one of a number of compression algorithms, which will - cause all new data that is written to this dataset to - be compressed as it is written. 
In addition to the - reduction in disk usage, this can also increase read - and write throughput, as only the smaller compressed - version of the file needs to be read or - written.<note> - <para>LZ4 compression is only available after &os; - 9.2</para> - </note></entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-deduplication">Deduplication</entry> - - <entry>ZFS has the ability to detect duplicate blocks of - data as they are written (thanks to the checksumming - feature). If deduplication is enabled, instead of - writing the block a second time, the reference count - of the existing block will be increased, saving - storage space. In order to do this, ZFS keeps a - deduplication table (<acronym>DDT</acronym>) in - memory, containing the list of unique checksums, the - location of that block and a reference count. When - new data is written, the checksum is calculated and - compared to the list. If a match is found, the data - is considered to be a duplicate. When deduplication - is enabled, the checksum algorithm is changed to - <acronym>SHA256</acronym> to provide a secure - cryptographic hash. ZFS deduplication is tunable; if - dedup is on, then a matching checksum is assumed to - mean that the data is identical. If dedup is set to - verify, then the data in the two blocks will be - checked byte-for-byte to ensure it is actually - identical and if it is not, the hash collision will be - noted by ZFS and the two blocks will be stored - separately. Due to the nature of the - <acronym>DDT</acronym>, having to store the hash of - each unique block, it consumes a very large amount of - memory (a general rule of thumb is 5-6 GB of ram - per 1 TB of deduplicated data). In situations - where it is not practical to have enough - <acronym>RAM</acronym> to keep the entire DDT in - memory, performance will suffer greatly as the DDT - will need to be read from disk before each new block - is written. 
Deduplication can make use of the L2ARC - to store the DDT, providing a middle ground between - fast system memory and slower disks. It is advisable - to consider using ZFS compression instead, which often - provides nearly as much space savings without the - additional memory requirement.</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-scrub">Scrub</entry> - - <entry>In place of a consistency check like fsck, ZFS - has the <literal>scrub</literal> command, which reads - all data blocks stored on the pool and verifies their - checksums them against the known good checksums stored - in the metadata. This periodic check of all the data - stored on the pool ensures the recovery of any - corrupted blocks before they are needed. A scrub is - not required after an unclean shutdown, but it is - recommended that you run a scrub at least once each - quarter. ZFS compares the checksum for each block as - it is read in the normal course of use, but a scrub - operation makes sure even infrequently used blocks are - checked for silent corruption.</entry> - </row> - - <row> - <entry valign="top" - id="filesystems-zfs-term-quota">Dataset Quota</entry> - - <entry>ZFS provides very fast and accurate dataset, user - and group space accounting in addition to quotes and - space reservations. This gives the administrator fine - grained control over how space is allocated and allows - critical file systems to reserve space to ensure other - file systems do not take all of the free space. - <para>ZFS supports different types of quotas: the - dataset quota, the <link + command removes the hold so the snapshot can then be + deleted. 
Snapshots can be taken on volumes, however + they can only be cloned or rolled back, not mounted + independently.</entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-clone">Clone</entry> + + <entry>Snapshots can also be cloned; a clone is a writable + version of a snapshot, allowing the file system to be + forked as a new dataset. As with a snapshot, a clone + initially consumes no additional space, only as new data + is written to a clone and new blocks are allocated does + the apparent size of the clone grow. As blocks are + overwritten in the cloned file system or volume, the + reference count on the previous block is decremented. + The snapshot upon which a clone is based cannot be + deleted because the clone is dependeant upon it (the + snapshot is the parent, and the clone is the child). + Clones can be <literal>promoted</literal>, reversing + this dependeancy, making the clone the parent and the + previous parent the child. This operation requires no + additional space, however it will change the way the + used space is accounted.</entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-checksum">Checksum</entry> + + <entry>Every block that is allocated is also checksummed + (which algorithm is used is a per dataset property, see: + zfs set). ZFS transparently validates the checksum of + each block as it is read, allowing ZFS to detect silent + corruption. If the data that is read does not match the + expected checksum, ZFS will attempt to recover the data + from any available redundancy (mirrors, RAID-Z). You + can trigger the validation of all checksums using the + <link linkend="filesystems-zfs-term-scrub">scrub</link> + command. 
The available checksum algorithms include: + + <itemizedlist> + <listitem> + <para>fletcher2</para> + </listitem> + + <listitem> + <para>fletcher4</para> + </listitem> + + <listitem> + <para>sha256</para> + </listitem> + </itemizedlist> + + The fletcher algorithms are faster, but sha256 is a + strong cryptographic hash and has a much lower chance of + a collisions at the cost of some performance. Checksums + can be disabled but it is inadvisable.</entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-compression">Compression</entry> + + <entry>Each dataset in ZFS has a compression property, + which defaults to off. This property can be set to one + of a number of compression algorithms, which will cause + all new data that is written to this dataset to be + compressed as it is written. In addition to the + reduction in disk usage, this can also increase read and + write throughput, as only the smaller compressed version + of the file needs to be read or written. + + <note> + <para>LZ4 compression is only available after &os; + 9.2</para> + </note></entry> + </row> + + <row> + <entry valign="top" + id="filesystems-zfs-term-deduplication">Deduplication</entry> + + <entry>ZFS has the ability to detect duplicate blocks of + data as they are written (thanks to the checksumming + feature). If deduplication is enabled, instead of + writing the block a second time, the reference count of + the existing block will be increased, saving storage + space. In order to do this, ZFS keeps a deduplication + table (<acronym>DDT</acronym>) in memory, containing the + list of unique checksums, the location of that block and + a reference count. When new data is written, the + checksum is calculated and compared to the list. If a + match is found, the data is considered to be a + duplicate. When deduplication is enabled, the checksum + algorithm is changed to <acronym>SHA256</acronym> to + provide a secure cryptographic hash. 
ZFS deduplication + is tunable; if dedup is on, then a matching checksum is + assumed to mean that the data is identical. If dedup is + set to verify, then the data in the two blocks will be + checked byte-for-byte to ensure it is actually identical + and if it is not, the hash collision will be noted by + ZFS and the two blocks will be stored separately. Due + to the nature of the <acronym>DDT</acronym>, having to + store the hash of each unique block, it consumes a very + large amount of memory (a general rule of thumb is + 5-6 GB of ram per 1 TB of deduplicated data). + In situations where it is not practical to have enough + <acronym>RAM</acronym> to keep the entire DDT in memory, *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***