From owner-svn-doc-projects@FreeBSD.ORG Wed Aug 14 23:34:16 2013 Return-Path: Delivered-To: svn-doc-projects@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 88C568C4; Wed, 14 Aug 2013 23:34:16 +0000 (UTC) (envelope-from wblock@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 734D52F67; Wed, 14 Aug 2013 23:34:16 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id r7ENYGYA021850; Wed, 14 Aug 2013 23:34:16 GMT (envelope-from wblock@svn.freebsd.org) Received: (from wblock@localhost) by svn.freebsd.org (8.14.7/8.14.5/Submit) id r7ENYGR9021849; Wed, 14 Aug 2013 23:34:16 GMT (envelope-from wblock@svn.freebsd.org) Message-Id: <201308142334.r7ENYGR9021849@svn.freebsd.org> From: Warren Block Date: Wed, 14 Aug 2013 23:34:16 +0000 (UTC) To: doc-committers@freebsd.org, svn-doc-projects@freebsd.org Subject: svn commit: r42542 - projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs X-SVN-Group: doc-projects MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-doc-projects@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SVN commit messages for doc projects trees List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 14 Aug 2013 23:34:16 -0000 Author: wblock Date: Wed Aug 14 23:34:16 2013 New Revision: 42542 URL: http://svnweb.freebsd.org/changeset/doc/42542 Log: Whitespace-only fixes. Translators, please ignore. Modified: projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Modified: projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml ============================================================================== --- projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Wed Aug 14 22:29:07 2013 (r42541) +++ projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Wed Aug 14 23:34:16 2013 (r42542) @@ -15,723 +15,729 @@ - The Z File System (ZFS) + The Z File System (ZFS) - The Z file system, originally developed by &sun;, - is designed to future proof the file system by removing many of - the arbitrary limits imposed on previous file systems. ZFS - allows continuous growth of the pooled storage by adding - additional devices. ZFS allows you to create many file systems - (in addition to block devices) out of a single shared pool of - storage. Space is allocated as needed, so all remaining free - space is available to each file system in the pool. It is also - designed for maximum data integrity, supporting data snapshots, - multiple copies, and cryptographic checksums. It uses a - software data replication model, known as - RAID-Z. RAID-Z provides - redundancy similar to hardware RAID, but is - designed to prevent data write corruption and to overcome some - of the limitations of hardware RAID. - - - ZFS Features and Terminology - - ZFS is a fundamentally different file system because it - is more than just a file system. 
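As a minimal illustration of the pooled model described above (the pool name storage and the device names ada1 and ada2 are placeholders, not taken from this chapter), a pool can be created and several file systems carved out of it; zfs list then shows every dataset drawing on the same remaining free space:

&prompt.root; zpool create storage ada1 ada2
&prompt.root; zfs create storage/home
&prompt.root; zfs create storage/backups
&prompt.root; zfs list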
ZFS combines the roles of - file system and volume manager, enabling additional storage - devices to be added to a live system and having the new space - available on all of the existing file systems in that pool - immediately. By combining the traditionally separate roles, - ZFS is able to overcome previous limitations that prevented - RAID groups being able to grow. Each top level device in a - zpool is called a vdev, which can be a simple disk or a RAID - transformation such as a mirror or RAID-Z array. ZFS file - systems (called datasets), each have access to the combined - free space of the entire pool. As blocks are allocated the - free space in the pool available to of each file system is - decreased. This approach avoids the common pitfall with - extensive partitioning where free space becomes fragmentated - across the partitions. - - - - - - zpool - - A storage pool is the most basic building block - of ZFS. A pool is made up of one or more vdevs, the - underlying devices that store the data. A pool is - then used to create one or more file systems - (datasets) or block devices (volumes). These datasets - and volumes share the pool of remaining free space. - Each pool is uniquely identified by a name and a - GUID. The zpool also controls the - version number and therefore the features available - for use with ZFS. - &os; 9.0 and 9.1 include - support for ZFS version 28. Future versions use ZFS - version 5000 with feature flags. This allows - greater cross-compatibility with other - implementations of ZFS. - - - - - vdev Types - - A zpool is made up of one or more vdevs, which - themselves can be a single disk or a group of disks, - in the case of a RAID transform. When multiple vdevs - are used, ZFS spreads data across the vdevs to - increase performance and maximize usable space. - - - - Disk - The most basic type - of vdev is a standard block device. This can be - an entire disk (such as - /dev/ada0 - or - /dev/da0) - or a partition - (/dev/ada0p3). - Contrary to the Solaris documentation, on &os; - there is no performance penalty for using a - partition rather than an entire disk. - - - - - File - In addition to - disks, ZFS pools can be backed by regular files, - this is especially useful for testing and - experimentation. Use the full path to the file - as the device path in the zpool create command. - All vdevs must be atleast 128 MB in - size. - - - - - Mirror - When creating a - mirror, specify the mirror - keyword followed by the list of member devices - for the mirror. A mirror consists of two or - more devices, all data will be written to all - member devices. A mirror vdev will only hold as - much data as its smallest member. A mirror vdev - can withstand the failure of all but one of its - members without losing any data. - - - - A regular single disk vdev can be - upgraded to a mirror vdev at any time using - the zpool The Z file system, originally developed by &sun;, + is designed to future proof the file system by removing many of + the arbitrary limits imposed on previous file systems. ZFS + allows continuous growth of the pooled storage by adding + additional devices. ZFS allows you to create many file systems + (in addition to block devices) out of a single shared pool of + storage. Space is allocated as needed, so all remaining free + space is available to each file system in the pool. It is also + designed for maximum data integrity, supporting data snapshots, + multiple copies, and cryptographic checksums. 
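For illustration, a mirrored pool can be created by listing the member devices after the mirror keyword, and an existing single-disk vdev can later be converted to a mirror with zpool attach (the pool and device names here are examples only):

&prompt.root; zpool create mypool mirror ada1 ada2
&prompt.root; zpool attach mypool ada1 ada3
&prompt.root; zpool status mypool

Attaching ada3 turns the two-way mirror into a three-way mirror; attaching a second disk to a plain single-disk vdev upgrades it to a mirror in the same way.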
It uses a + software data replication model, known as + RAID-Z. RAID-Z provides + redundancy similar to hardware RAID, but is + designed to prevent data write corruption and to overcome some + of the limitations of hardware RAID. + + + ZFS Features and Terminology + + ZFS is a fundamentally different file system because it + is more than just a file system. ZFS combines the roles of + file system and volume manager, enabling additional storage + devices to be added to a live system and having the new space + available on all of the existing file systems in that pool + immediately. By combining the traditionally separate roles, + ZFS is able to overcome previous limitations that prevented + RAID groups being able to grow. Each top level device in a + zpool is called a vdev, which can be a simple disk or a RAID + transformation such as a mirror or RAID-Z array. ZFS file + systems (called datasets), each have access to the combined + free space of the entire pool. As blocks are allocated the + free space in the pool available to of each file system is + decreased. This approach avoids the common pitfall with + extensive partitioning where free space becomes fragmentated + across the partitions. + + + + + + zpool + + A storage pool is the most basic building block of + ZFS. A pool is made up of one or more vdevs, the + underlying devices that store the data. A pool is then + used to create one or more file systems (datasets) or + block devices (volumes). These datasets and volumes + share the pool of remaining free space. Each pool is + uniquely identified by a name and a + GUID. The zpool also controls the + version number and therefore the features available for + use with ZFS. + + + &os; 9.0 and 9.1 include support for ZFS version + 28. Future versions use ZFS version 5000 with + feature flags. This allows greater + cross-compatibility with other implementations of + ZFS. + + + + + vdev Types + + A zpool is made up of one or more vdevs, which + themselves can be a single disk or a group of disks, in + the case of a RAID transform. When multiple vdevs are + used, ZFS spreads data across the vdevs to increase + performance and maximize usable space. + + + + + Disk - The most basic type + of vdev is a standard block device. This can be + an entire disk (such as + /dev/ada0 + or + /dev/da0) + or a partition + (/dev/ada0p3). + Contrary to the Solaris documentation, on &os; + there is no performance penalty for using a + partition rather than an entire disk. + + + + + File - In addition to + disks, ZFS pools can be backed by regular files, + this is especially useful for testing and + experimentation. Use the full path to the file + as the device path in the zpool create command. + All vdevs must be atleast 128 MB in + size. + + + + + Mirror - When creating a + mirror, specify the mirror + keyword followed by the list of member devices + for the mirror. A mirror consists of two or + more devices, all data will be written to all + member devices. A mirror vdev will only hold as + much data as its smallest member. A mirror vdev + can withstand the failure of all but one of its + members without losing any data. + + + regular single disk vdev can be upgraded to + a mirror vdev at any time using the + zpool attach - command. - - - - - - RAID-Z - - ZFS implements RAID-Z, a variation on standard - RAID-5 that offers better distribution of parity - and eliminates the "RAID-5 write hole" in which - the data and parity information become - inconsistent after an unexpected restart. 
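A file-backed pool is convenient for the testing and experimentation mentioned above. This sketch assumes the backing files are created with truncate(1) and that their full paths are passed to zpool create; all names are placeholders:

&prompt.root; truncate -s 128m /tmp/file1 /tmp/file2
&prompt.root; zpool create testpool /tmp/file1 /tmp/file2
&prompt.root; zpool status testpool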
ZFS - supports 3 levels of RAID-Z which provide - varying levels of redundancy in exchange for - decreasing levels of usable storage. The types - are named RAID-Z1 through Z3 based on the number - of parity devinces in the array and the number - of disks that the pool can operate - without. - - In a RAID-Z1 configuration with 4 disks, - each 1 TB, usable storage will be 3 TB - and the pool will still be able to operate in - degraded mode with one faulted disk. If an - additional disk goes offline before the faulted - disk is replaced and resilvered, all data in the - pool can be lost. - - In a RAID-Z3 configuration with 8 disks of - 1 TB, the volume would provide 5TB of - usable space and still be able to operate with - three faulted disks. Sun recommends no more - than 9 disks in a single vdev. If the - configuration has more disks, it is recommended - to divide them into separate vdevs and the pool - data will be striped across them. - - A configuration of 2 RAID-Z2 vdevs - consisting of 8 disks each would create - something similar to a RAID 60 array. A RAID-Z - group's storage capacity is approximately the - size of the smallest disk, multiplied by the - number of non-parity disks. 4x 1 TB disks - in Z1 has an effective size of approximately - 3 TB, and a 8x 1 TB array in Z3 will - yeild 5 TB of usable space. - - - - - Spare - ZFS has a special - pseudo-vdev type for keeping track of available - hot spares. Note that installed hot spares are - not deployed automatically; they must manually - be configured to replace the failed device using - the zfs replace command. - - - - - Log - ZFS Log Devices, also - known as ZFS Intent Log (ZIL) - move the intent log from the regular pool - devices to a dedicated device. The ZIL - accelerates synchronous transactions by using - storage devices (such as - SSDs) that are faster - compared to those used for the main pool. When - data is being written and the application - requests a guarantee that the data has been - safely stored, the data is written to the faster - ZIL storage, then later flushed out to the - regular disks, greatly reducing the latency of - synchronous writes. Log devices can be - mirrored, but RAID-Z is not supported. When - specifying multiple log devices writes will be - load balanced across all devices. - - - - - Cache - Adding a cache vdev - to a zpool will add the storage of the cache to - the L2ARC. Cache devices cannot be mirrored. - Since a cache device only stores additional - copies of existing data, there is no risk of - data loss. - - - - - - Adaptive Replacement - Cache (ARC) - - ZFS uses an Adaptive Replacement Cache - (ARC), rather than a more - traditional Least Recently Used - (LRU) cache. An - LRU cache is a simple list of items - in the cache sorted by when each object was most - recently used; new items are added to the top of the - list and once the cache is full items from the bottom - of the list are evicted to make room for more active - objects. An ARC consists of four - lists; the Most Recently Used (MRU) - and Most Frequently Used (MFU) - objects, plus a ghost list for each. These ghost - lists tracks recently evicted objects to provent them - being added back to the cache. This increases the - cache hit ratio by avoiding objects that have a - history of only being used occasionally. Another - advantage of using both an MRU and - MFU is that scanning an entire - filesystem would normally evict all data from an - MRU or LRU cache - in favor of this freshly accessed content. 
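To make the capacity discussion concrete, the four-disk RAID-Z1 example might be created as follows (pool and device names are placeholders); roughly three disks' worth of space is usable while one disk's worth holds parity:

&prompt.root; zpool create tank raidz1 da0 da1 da2 da3
&prompt.root; zpool list tank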
In the - case of ZFS since there is also an - MFU that only tracks the most - frequently used objects, the cache of the most - commonly accessed blocks remains. - - - - L2ARC - - The L2ARC is the second level - of the ZFS caching system. The - primary ARC is stored in - RAM, however since the amount of - available RAM is often limited, - ZFS can also make use of + + + + + + RAID-Z - + ZFS implements RAID-Z, a variation on standard + RAID-5 that offers better distribution of parity + and eliminates the "RAID-5 write hole" in which + the data and parity information become + inconsistent after an unexpected restart. ZFS + supports 3 levels of RAID-Z which provide + varying levels of redundancy in exchange for + decreasing levels of usable storage. The types + are named RAID-Z1 through Z3 based on the number + of parity devinces in the array and the number + of disks that the pool can operate + without. + + In a RAID-Z1 configuration with 4 disks, + each 1 TB, usable storage will be 3 TB + and the pool will still be able to operate in + degraded mode with one faulted disk. If an + additional disk goes offline before the faulted + disk is replaced and resilvered, all data in the + pool can be lost. + + In a RAID-Z3 configuration with 8 disks of + 1 TB, the volume would provide 5TB of + usable space and still be able to operate with + three faulted disks. Sun recommends no more + than 9 disks in a single vdev. If the + configuration has more disks, it is recommended + to divide them into separate vdevs and the pool + data will be striped across them. + + A configuration of 2 RAID-Z2 vdevs + consisting of 8 disks each would create + something similar to a RAID 60 array. A RAID-Z + group's storage capacity is approximately the + size of the smallest disk, multiplied by the + number of non-parity disks. 4x 1 TB disks + in Z1 has an effective size of approximately + 3 TB, and a 8x 1 TB array in Z3 will + yeild 5 TB of usable space. + + + + + Spare - ZFS has a special + pseudo-vdev type for keeping track of available + hot spares. Note that installed hot spares are + not deployed automatically; they must manually + be configured to replace the failed device using + the zfs replace command. + + + + + Log - ZFS Log Devices, also + known as ZFS Intent Log (ZIL) + move the intent log from the regular pool + devices to a dedicated device. The ZIL + accelerates synchronous transactions by using + storage devices (such as + SSDs) that are faster + compared to those used for the main pool. When + data is being written and the application + requests a guarantee that the data has been + safely stored, the data is written to the faster + ZIL storage, then later flushed out to the + regular disks, greatly reducing the latency of + synchronous writes. Log devices can be + mirrored, but RAID-Z is not supported. When + specifying multiple log devices writes will be + load balanced across all devices. + + + + + Cache - Adding a cache vdev + to a zpool will add the storage of the cache to + the L2ARC. Cache devices cannot be mirrored. + Since a cache device only stores additional + copies of existing data, there is no risk of + data loss. + + + + + + Adaptive Replacement + Cache (ARC) + + ZFS uses an Adaptive Replacement Cache + (ARC), rather than a more + traditional Least Recently Used + (LRU) cache. 
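A brief sketch of the spare, log, and cache vdev types discussed above, using placeholder names. Note that swapping in a replacement disk is done with the zpool replace subcommand (the replace subcommand is provided by zpool rather than zfs):

&prompt.root; zpool add tank spare da4
&prompt.root; zpool add tank log da5
&prompt.root; zpool add tank cache da6
&prompt.root; zpool replace tank da2 da4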
An + LRU cache is a simple list of items + in the cache sorted by when each object was most + recently used; new items are added to the top of the + list and once the cache is full items from the bottom + of the list are evicted to make room for more active + objects. An ARC consists of four + lists; the Most Recently Used (MRU) + and Most Frequently Used (MFU) + objects, plus a ghost list for each. These ghost + lists tracks recently evicted objects to provent them + being added back to the cache. This increases the + cache hit ratio by avoiding objects that have a + history of only being used occasionally. Another + advantage of using both an MRU and + MFU is that scanning an entire + filesystem would normally evict all data from an + MRU or LRU cache + in favor of this freshly accessed content. In the + case of ZFS since there is also an + MFU that only tracks the most + frequently used objects, the cache of the most + commonly accessed blocks remains. + + + + L2ARC + + The L2ARC is the second level + of the ZFS caching system. The + primary ARC is stored in + RAM, however since the amount of + available RAM is often limited, + ZFS can also make use of cache - vdevs. Solid State Disks (SSDs) - are often used as these cache devices due to their - higher speed and lower latency compared to traditional - spinning disks. An L2ARC is entirely optional, but - having one will significantly increase read speeds for - files that are cached on the SSD - instead of having to be read from the regular spinning - disks. The L2ARC can also speed up SSDs) are + often used as these cache devices due to their higher + speed and lower latency compared to traditional spinning + disks. An L2ARC is entirely optional, but having one + will significantly increase read speeds for files that + are cached on the SSD instead of + having to be read from the regular spinning disks. The + L2ARC can also speed up deduplication - since a DDT that does not fit in - RAM but does fit in the - L2ARC will be much faster than if - the DDT had to be read from disk. - The rate at which data is added to the cache devices - is limited to prevent prematurely wearing out the - SSD with too many writes. Until - the cache is full (the first block has been evicted to - make room), writing to the L2ARC is - limited to the sum of the write limit and the boost - limit, then after that limited to the write limit. A - pair of sysctl values control these rate limits; - vfs.zfs.l2arc_write_max controls - how many bytes are written to the cache per second, - while vfs.zfs.l2arc_write_boost - adds to this limit during the "Turbo Warmup Phase" - (Write Boost). - - - - Copy-On-Write - - Unlike a traditional file system, when data is - overwritten on ZFS the new data is written to a - different block rather than overwriting the old data - in place. Only once this write is complete is the - metadata then updated to point to the new location of - the data. This means that in the event of a shorn - write (a system crash or power loss in the middle of - writing a file) the entire original contents of the - file are still available and the incomplete write is - discarded. This also means that ZFS does not require - a fsck after an unexpected shutdown. - - - - Dataset - - Dataset is the generic term for a ZFS file - system, volume, snapshot or clone. Each dataset will - have a unique name in the format: - poolname/path@snapshot. The root - of the pool is technically a dataset as well. 
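The two rate-limit tunables named above can be inspected, and raised if desired, with sysctl(8); the value shown here (16 MB per second) is only an example:

&prompt.root; sysctl vfs.zfs.l2arc_write_max
&prompt.root; sysctl vfs.zfs.l2arc_write_boost
&prompt.root; sysctl vfs.zfs.l2arc_write_max=16777216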
Child - datasets are named hierarchically like directories; - for example mypool/home, the home - dataset is a child of mypool and inherits properties - from it. This can be expended further by creating - mypool/home/user. This grandchild - dataset will inherity properties from the parent and - grandparent. It is also possible to set properties - on a child to override the defaults inherited from the - parents and grandparents. ZFS also allows - administration of datasets and their children to be - delegated. - - - - Volume - - In additional to regular file system datasets, - ZFS can also create volumes, which are block devices. - Volumes have many of the same features, including - copy-on-write, snapshots, clones and - checksumming. Volumes can be useful for running other - file system formats on top of ZFS, such as UFS or in - the case of Virtualization or exporting - iSCSI extents. - - - - Snapshot - - The copy-on-write - design of ZFS allows for nearly instantaneous - consistent snapshots with arbitrary names. After - taking a snapshot of a dataset (or a recursive - snapshot of a parent dataset that will include all - child datasets), new data is written to new blocks (as - described above), however the old blocks are not - reclaimed as free space. There are then two versions - of the file system, the snapshot (what the file system - looked like before) and the live file system; however - no additional space is used. As new data is written - to the live file system, new blocks are allocated to - store this data. The apparent size of the snapshot - will grow as the blocks are no longer used in the live - file system, but only in the snapshot. These - snapshots can be mounted (read only) to allow for the - recovery of previous versions of files. It is also - possible to DDT that does not fit in + RAM but does fit in the + L2ARC will be much faster than if the + DDT had to be read from disk. The + rate at which data is added to the cache devices is + limited to prevent prematurely wearing out the + SSD with too many writes. Until the + cache is full (the first block has been evicted to make + room), writing to the L2ARC is + limited to the sum of the write limit and the boost + limit, then after that limited to the write limit. A + pair of sysctl values control these rate limits; + vfs.zfs.l2arc_write_max controls how + many bytes are written to the cache per second, while + vfs.zfs.l2arc_write_boost adds to + this limit during the "Turbo Warmup Phase" (Write + Boost). + + + + Copy-On-Write + + Unlike a traditional file system, when data is + overwritten on ZFS the new data is written to a + different block rather than overwriting the old data in + place. Only once this write is complete is the metadata + then updated to point to the new location of the data. + This means that in the event of a shorn write (a system + crash or power loss in the middle of writing a file) the + entire original contents of the file are still available + and the incomplete write is discarded. This also means + that ZFS does not require a fsck after an unexpected + shutdown. + + + + Dataset + + Dataset is the generic term for a ZFS file system, + volume, snapshot or clone. Each dataset will have a + unique name in the format: + poolname/path@snapshot. The root of + the pool is technically a dataset as well. Child + datasets are named hierarchically like directories; for + example mypool/home, the home dataset + is a child of mypool and inherits properties from it. 
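The dataset hierarchy, property inheritance, delegated administration, and volumes described here can be sketched with a few commands (the user name alice and the volume size are illustrative placeholders):

&prompt.root; zfs create mypool/home
&prompt.root; zfs create mypool/home/user
&prompt.root; zfs get compression mypool/home/user
&prompt.root; zfs allow alice snapshot,mount mypool/home
&prompt.root; zfs create -V 4G mypool/vol0

The zfs get output includes a SOURCE column showing whether a property value is set locally or inherited from a parent dataset; zfs create -V creates a volume rather than a file system.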
+ This can be expended further by creating + mypool/home/user. This grandchild + dataset will inherity properties from the parent and + grandparent. It is also possible to set properties + on a child to override the defaults inherited from the + parents and grandparents. ZFS also allows + administration of datasets and their children to be + delegated. + + + + Volume + + In additional to regular file system datasets, ZFS + can also create volumes, which are block devices. + Volumes have many of the same features, including + copy-on-write, snapshots, clones and checksumming. + Volumes can be useful for running other file system + formats on top of ZFS, such as UFS or in the case of + Virtualization or exporting iSCSI + extents. + + + + Snapshot + + The copy-on-write + + design of ZFS allows for nearly instantaneous consistent + snapshots with arbitrary names. After taking a snapshot + of a dataset (or a recursive snapshot of a parent + dataset that will include all child datasets), new data + is written to new blocks (as described above), however + the old blocks are not reclaimed as free space. There + are then two versions of the file system, the snapshot + (what the file system looked like before) and the live + file system; however no additional space is used. As + new data is written to the live file system, new blocks + are allocated to store this data. The apparent size of + the snapshot will grow as the blocks are no longer used + in the live file system, but only in the snapshot. + These snapshots can be mounted (read only) to allow for + the recovery of previous versions of files. It is also + possible to rollback - a live file system to a specific snapshot, undoing any - changes that took place after the snapshot was taken. - Each block in the zpool has a reference counter which - indicates how many snapshots, clones, datasets or - volumes make use of that block. As files and - snapshots are deleted, the reference count is - decremented; once a block is no longer referenced, it - is reclaimed as free space. Snapshots can also be - marked with a hold, - once a snapshot is held, any attempt to destroy it - will return an EBUY error. Each snapshot can have - multiple holds, each with a unique name. The release - command removes the hold so the snapshot can then be - deleted. Snapshots can be taken on volumes, however - they can only be cloned or rolled back, not mounted - independently. - - - - Clone - - Snapshots can also be cloned; a clone is a - writable version of a snapshot, allowing the file - system to be forked as a new dataset. As with a - snapshot, a clone initially consumes no additional - space, only as new data is written to a clone and new - blocks are allocated does the apparent size of the - clone grow. As blocks are overwritten in the cloned - file system or volume, the reference count on the - previous block is decremented. The snapshot upon - which a clone is based cannot be deleted because the - clone is dependeant upon it (the snapshot is the - parent, and the clone is the child). Clones can be - promoted, reversing this - dependeancy, making the clone the parent and the - previous parent the child. This operation requires no - additional space, however it will change the way the - used space is accounted. - - - - Checksum - - Every block that is allocated is also checksummed - (which algorithm is used is a per dataset property, - see: zfs set). ZFS transparently validates the - checksum of each block as it is read, allowing ZFS to - detect silent corruption. 
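A hedged walk-through of the snapshot, hold, rollback, clone, and promote operations mentioned above; the snapshot name and hold tag are arbitrary examples. While a hold is in place, attempting to destroy the snapshot fails with EBUSY:

&prompt.root; zfs snapshot -r mypool/home@2013-08-14
&prompt.root; zfs hold keep mypool/home@2013-08-14
&prompt.root; zfs release keep mypool/home@2013-08-14
&prompt.root; zfs rollback mypool/home@2013-08-14
&prompt.root; zfs clone mypool/home@2013-08-14 mypool/home-clone
&prompt.root; zfs promote mypool/home-clone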
If the data that is read - does not match the expected checksum, ZFS will attempt - to recover the data from any available redundancy - (mirrors, RAID-Z). You can trigger the validation of - all checksums using the scrub - command. The available checksum algorithms include: - - fletcher2 - fletcher4 - sha256 - The fletcher algorithms are faster, - but sha256 is a strong cryptographic hash and has a - much lower chance of a collisions at the cost of some - performance. Checksums can be disabled but it is - inadvisable. - - - - Compression - - Each dataset in ZFS has a compression property, - which defaults to off. This property can be set to - one of a number of compression algorithms, which will - cause all new data that is written to this dataset to - be compressed as it is written. In addition to the - reduction in disk usage, this can also increase read - and write throughput, as only the smaller compressed - version of the file needs to be read or - written. - LZ4 compression is only available after &os; - 9.2 - - - - - Deduplication - - ZFS has the ability to detect duplicate blocks of - data as they are written (thanks to the checksumming - feature). If deduplication is enabled, instead of - writing the block a second time, the reference count - of the existing block will be increased, saving - storage space. In order to do this, ZFS keeps a - deduplication table (DDT) in - memory, containing the list of unique checksums, the - location of that block and a reference count. When - new data is written, the checksum is calculated and - compared to the list. If a match is found, the data - is considered to be a duplicate. When deduplication - is enabled, the checksum algorithm is changed to - SHA256 to provide a secure - cryptographic hash. ZFS deduplication is tunable; if - dedup is on, then a matching checksum is assumed to - mean that the data is identical. If dedup is set to - verify, then the data in the two blocks will be - checked byte-for-byte to ensure it is actually - identical and if it is not, the hash collision will be - noted by ZFS and the two blocks will be stored - separately. Due to the nature of the - DDT, having to store the hash of - each unique block, it consumes a very large amount of - memory (a general rule of thumb is 5-6 GB of ram - per 1 TB of deduplicated data). In situations - where it is not practical to have enough - RAM to keep the entire DDT in - memory, performance will suffer greatly as the DDT - will need to be read from disk before each new block - is written. Deduplication can make use of the L2ARC - to store the DDT, providing a middle ground between - fast system memory and slower disks. It is advisable - to consider using ZFS compression instead, which often - provides nearly as much space savings without the - additional memory requirement. - - - - Scrub - - In place of a consistency check like fsck, ZFS - has the scrub command, which reads - all data blocks stored on the pool and verifies their - checksums them against the known good checksums stored - in the metadata. This periodic check of all the data - stored on the pool ensures the recovery of any - corrupted blocks before they are needed. A scrub is - not required after an unclean shutdown, but it is - recommended that you run a scrub at least once each - quarter. ZFS compares the checksum for each block as - it is read in the normal course of use, but a scrub - operation makes sure even infrequently used blocks are - checked for silent corruption. 
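For example, the checksum algorithm is selected per dataset with zfs set, and a manual check of every block in a pool is started with zpool scrub; progress is reported by zpool status (names are placeholders):

&prompt.root; zfs set checksum=sha256 mypool/home
&prompt.root; zpool scrub mypool
&prompt.root; zpool status mypool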
- - - - Dataset Quota - - ZFS provides very fast and accurate dataset, user - and group space accounting in addition to quotes and - space reservations. This gives the administrator fine - grained control over how space is allocated and allows - critical file systems to reserve space to ensure other - file systems do not take all of the free space. - ZFS supports different types of quotas: the - dataset quota, the + + + + Clone + + Snapshots can also be cloned; a clone is a writable + version of a snapshot, allowing the file system to be + forked as a new dataset. As with a snapshot, a clone + initially consumes no additional space, only as new data + is written to a clone and new blocks are allocated does + the apparent size of the clone grow. As blocks are + overwritten in the cloned file system or volume, the + reference count on the previous block is decremented. + The snapshot upon which a clone is based cannot be + deleted because the clone is dependeant upon it (the + snapshot is the parent, and the clone is the child). + Clones can be promoted, reversing + this dependeancy, making the clone the parent and the + previous parent the child. This operation requires no + additional space, however it will change the way the + used space is accounted. + + + + Checksum + + Every block that is allocated is also checksummed + (which algorithm is used is a per dataset property, see: + zfs set). ZFS transparently validates the checksum of + each block as it is read, allowing ZFS to detect silent + corruption. If the data that is read does not match the + expected checksum, ZFS will attempt to recover the data + from any available redundancy (mirrors, RAID-Z). You + can trigger the validation of all checksums using the + scrub + command. The available checksum algorithms include: + + + + fletcher2 + + + + fletcher4 + + + + sha256 + + + + The fletcher algorithms are faster, but sha256 is a + strong cryptographic hash and has a much lower chance of + a collisions at the cost of some performance. Checksums + can be disabled but it is inadvisable. + + + + Compression + + Each dataset in ZFS has a compression property, + which defaults to off. This property can be set to one + of a number of compression algorithms, which will cause + all new data that is written to this dataset to be + compressed as it is written. In addition to the + reduction in disk usage, this can also increase read and + write throughput, as only the smaller compressed version + of the file needs to be read or written. + + + LZ4 compression is only available after &os; + 9.2 + + + + + Deduplication + + ZFS has the ability to detect duplicate blocks of + data as they are written (thanks to the checksumming + feature). If deduplication is enabled, instead of + writing the block a second time, the reference count of + the existing block will be increased, saving storage + space. In order to do this, ZFS keeps a deduplication + table (DDT) in memory, containing the + list of unique checksums, the location of that block and + a reference count. When new data is written, the + checksum is calculated and compared to the list. If a + match is found, the data is considered to be a + duplicate. When deduplication is enabled, the checksum + algorithm is changed to SHA256 to + provide a secure cryptographic hash. ZFS deduplication + is tunable; if dedup is on, then a matching checksum is + assumed to mean that the data is identical. 
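The compression, quota, and reservation properties discussed here are also set with zfs set; the dataset name and sizes below are illustrative only, and, as noted above, lz4 requires &os; 9.2 or later:

&prompt.root; zfs set compression=lz4 mypool/home
&prompt.root; zfs set quota=10G mypool/home
&prompt.root; zfs set reservation=5G mypool/home
&prompt.root; zfs get compression,quota,reservation mypool/home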
If dedup is + set to verify, then the data in the two blocks will be + checked byte-for-byte to ensure it is actually identical + and, if it is not, the hash collision will be noted by + ZFS and the two blocks will be stored separately. Due + to the nature of the DDT, which must + store the hash of each unique block, it consumes a very + large amount of memory (a general rule of thumb is + 5-6 GB of RAM per 1 TB of deduplicated data). + In situations where it is not practical to have enough + RAM to keep the entire DDT in memory, *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
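Rounding out the deduplication discussion, a hedged example of enabling verify-mode dedup on a dataset and then checking the pool-wide deduplication ratio (the dataset name is a placeholder):

&prompt.root; zfs set dedup=verify mypool/data
&prompt.root; zpool list mypool

The DEDUP column of zpool list reports the deduplication ratio achieved so far.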