From owner-svn-doc-projects@FreeBSD.ORG Thu Aug 15 02:28:44 2013 Return-Path: Delivered-To: svn-doc-projects@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTP id 5171BBCC; Thu, 15 Aug 2013 02:28:44 +0000 (UTC) (envelope-from wblock@FreeBSD.org) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 3CBD02B6D; Thu, 15 Aug 2013 02:28:44 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id r7F2SiYG088136; Thu, 15 Aug 2013 02:28:44 GMT (envelope-from wblock@svn.freebsd.org) Received: (from wblock@localhost) by svn.freebsd.org (8.14.7/8.14.5/Submit) id r7F2SiIN088135; Thu, 15 Aug 2013 02:28:44 GMT (envelope-from wblock@svn.freebsd.org) Message-Id: <201308150228.r7F2SiIN088135@svn.freebsd.org> From: Warren Block Date: Thu, 15 Aug 2013 02:28:44 +0000 (UTC) To: doc-committers@freebsd.org, svn-doc-projects@freebsd.org Subject: svn commit: r42548 - projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs X-SVN-Group: doc-projects MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-doc-projects@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: SVN commit messages for doc projects trees List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Aug 2013 02:28:44 -0000 Author: wblock Date: Thu Aug 15 02:28:43 2013 New Revision: 42548 URL: http://svnweb.freebsd.org/changeset/doc/42548 Log: Whitespace-only fixes. Translators, please ignore. Modified: projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Modified: projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml ============================================================================== --- projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Thu Aug 15 02:01:36 2013 (r42547) +++ projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Thu Aug 15 02:28:43 2013 (r42548) @@ -36,32 +36,35 @@ What Makes <acronym>ZFS</acronym> Different - ZFS is significantly different from any previous file system - owing to the fact that it is more than just a file system. ZFS - combines the traditionally separate roles of volume manager and - file system, which provides unique advantages because the file - system is now aware of the underlying structure of the disks. - Traditional file systems could only be created on a single disk - at a time, if there were two disks then two separate file - systems would have to be created. In a traditional hardware + ZFS is significantly different from any + previous file system owing to the fact that it is more than just + a file system. ZFS combines the + traditionally separate roles of volume manager and file system, + which provides unique advantages because the file system is now + aware of the underlying structure of the disks. Traditional + file systems could only be created on a single disk at a time, + if there were two disks then two separate file systems would + have to be created. 
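As a quick illustration of the pooled model described here (a sketch only; the pool name storage and the disks ada1 and ada2 are hypothetical), a single pool built from both disks can hold any number of file systems, all drawing on the same free space:

   # zpool create storage ada1 ada2    # hypothetical pool and disk names
   # zfs create storage/home
   # zfs create storage/www
   # zfs list

With a traditional file system, the same two disks would have carried two independent file systems, each with its own fixed amount of space.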
In a traditional hardware RAID configuration, this problem was worked around by presenting the operating system with a single logical disk made up of the space provided by a number of disks, on top of which the operating system placed its file system. Even in the case of software RAID solutions like - GEOM, the UFS file system living on top of - the RAID transform believed that it was - dealing with a single device. ZFS's combination of the volume - manager and the file system solves this and allows the creation - of many file systems all sharing a pool of available storage. - One of the biggest advantages to ZFS's awareness of the physical - layout of the disks is that ZFS can grow the existing file - systems automatically when additional disks are added to the - pool. This new space is then made available to all of the file - systems. ZFS also has a number of different properties that can - be applied to each file system, creating many advantages to - creating a number of different filesystems and datasets rather - than a single monolithic filesystem. + GEOM, the UFS file system + living on top of the RAID transform believed + that it was dealing with a single device. + ZFS's combination of the volume manager and + the file system solves this and allows the creation of many file + systems all sharing a pool of available storage. One of the + biggest advantages to ZFS's awareness of the + physical layout of the disks is that ZFS can + grow the existing file systems automatically when additional + disks are added to the pool. This new space is then made + available to all of the file systems. ZFS + also has a number of different properties that can be applied to + each file system, creating many advantages to creating a number + of different filesystems and datasets rather than a single + monolithic filesystem. @@ -69,7 +72,8 @@ There is a start up mechanism that allows &os; to mount ZFS pools during system initialization. To - enable it, add this line to /etc/rc.conf: + enable it, add this line to + /etc/rc.conf: zfs_enable="YES" @@ -135,8 +139,9 @@ drwxr-xr-x 21 root wheel 512 Aug 29 2 &prompt.root; zfs set compression=off example/compressed - To unmount a file system, use zfs umount and - then verify by using df: + To unmount a file system, use + zfs umount and then verify by using + df: &prompt.root; zfs umount example/compressed &prompt.root; df @@ -146,8 +151,9 @@ devfs 1 1 0 /dev/ad0s1d 54098308 1032864 48737580 2% /usr example 17547008 0 17547008 0% /example - To re-mount the file system to make it accessible again, use zfs mount - and verify with df: + To re-mount the file system to make it accessible again, + use zfs mount and verify with + df: &prompt.root; zfs mount example/compressed &prompt.root; df @@ -214,9 +220,9 @@ example/data 17547008 0 175 There is no way to prevent a disk from failing. One method of avoiding data loss due to a failed hard disk is to implement RAID. ZFS - supports this feature in its pool design. RAID-Z pools - require 3 or more disks but yield more usable space than - mirrored pools. + supports this feature in its pool design. + RAID-Z pools require 3 or more disks but + yield more usable space than mirrored pools. To create a RAID-Z pool, issue the following command and specify the disks to add to the @@ -727,31 +733,35 @@ errors: No known data errors Some of the features provided by ZFS are RAM-intensive, so some tuning may be required to provide - maximum efficiency on systems with limited RAM. + maximum efficiency on systems with limited + RAM. 
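One common example of such tuning (a sketch, not a requirement; the 512M figure is purely illustrative) is capping the size of the ARC with a loader tunable so that ZFS leaves more memory for the rest of the system. Add a line like the following to /boot/loader.conf; it takes effect at the next boot:

   # placeholder cap for the ARC; adjust to suit the machine
   vfs.zfs.arc_max="512M"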
Memory At a bare minimum, the total system memory should be at - least one gigabyte. The amount of recommended RAM depends - upon the size of the pool and the ZFS features which are - used. A general rule of thumb is 1 GB of RAM for every 1 TB - of storage. If the deduplication feature is used, a general - rule of thumb is 5 GB of RAM per TB of storage to be - deduplicated. While some users successfully use ZFS with - less RAM, it is possible that when the system is under heavy - load, it may panic due to memory exhaustion. Further tuning - may be required for systems with less than the recommended - RAM requirements. + least one gigabyte. The amount of recommended + RAM depends upon the size of the pool and + the ZFS features which are used. A + general rule of thumb is 1 GB of RAM for every + 1 TB of storage. If the deduplication feature is used, + a general rule of thumb is 5 GB of RAM per TB of + storage to be deduplicated. While some users successfully + use ZFS with less RAM, + it is possible that when the system is under heavy load, it + may panic due to memory exhaustion. Further tuning may be + required for systems with less than the recommended RAM + requirements. Kernel Configuration - Due to the RAM limitations of the &i386; platform, users - using ZFS on the &i386; architecture should add the - following option to a custom kernel configuration file, - rebuild the kernel, and reboot: + Due to the RAM limitations of the + &i386; platform, users using ZFS on the + &i386; architecture should add the following option to a + custom kernel configuration file, rebuild the kernel, and + reboot: options KVA_PAGES=512 @@ -831,20 +841,22 @@ vfs.zfs.vdev.cache.size="5M" <acronym>ZFS</acronym> Features and Terminology - ZFS is a fundamentally different file system because it - is more than just a file system. ZFS combines the roles of - file system and volume manager, enabling additional storage - devices to be added to a live system and having the new space - available on all of the existing file systems in that pool - immediately. By combining the traditionally separate roles, - ZFS is able to overcome previous limitations that prevented - RAID groups being able to grow. Each top level device in a - zpool is called a vdev, which can be a simple disk or a RAID - transformation such as a mirror or RAID-Z array. ZFS file - systems (called datasets), each have access to the combined - free space of the entire pool. As blocks are allocated from - the pool, the space available to each file system - decreases. This approach avoids the common pitfall with + ZFS is a fundamentally different file + system because it is more than just a file system. + ZFS combines the roles of file system and + volume manager, enabling additional storage devices to be added + to a live system and having the new space available on all of + the existing file systems in that pool immediately. By + combining the traditionally separate roles, + ZFS is able to overcome previous limitations + that prevented RAID groups being able to + grow. Each top level device in a zpool is called a vdev, which + can be a simple disk or a RAID transformation + such as a mirror or RAID-Z array. + ZFS file systems (called datasets), each have + access to the combined free space of the entire pool. As blocks + are allocated from the pool, the space available to each file + system decreases. This approach avoids the common pitfall with extensive partitioning where free space becomes fragmentated across the partitions. 
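A short sketch of what this looks like in practice (pool, dataset, and device names are hypothetical): adding another vdev to a live pool immediately enlarges the space reported for every dataset in it, with no resizing step.

   # zpool add mypool mirror ada3 ada4   # hypothetical pool and disk names
   # zfs list

After the zpool add, the AVAIL column of zfs list grows by roughly the size of the new mirror for all of the pool's datasets.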
@@ -855,21 +867,22 @@ vfs.zfs.vdev.cache.size="5M"zpool A storage pool is the most basic building block of - ZFS. A pool is made up of one or more vdevs, the - underlying devices that store the data. A pool is then - used to create one or more file systems (datasets) or - block devices (volumes). These datasets and volumes - share the pool of remaining free space. Each pool is - uniquely identified by a name and a + ZFS. A pool is made up of one or + more vdevs, the underlying devices that store the data. + A pool is then used to create one or more file systems + (datasets) or block devices (volumes). These datasets + and volumes share the pool of remaining free space. + Each pool is uniquely identified by a name and a GUID. The zpool also controls the version number and therefore the features available for use with ZFS. - &os; 9.0 and 9.1 include support for ZFS version - 28. Future versions use ZFS version 5000 with - feature flags. This allows greater - cross-compatibility with other implementations of + &os; 9.0 and 9.1 include support for + ZFS version 28. Future versions + use ZFS version 5000 with feature + flags. This allows greater cross-compatibility with + other implementations of ZFS. @@ -879,9 +892,10 @@ vfs.zfs.vdev.cache.size="5M"A zpool is made up of one or more vdevs, which themselves can be a single disk or a group of disks, in - the case of a RAID transform. When multiple vdevs are - used, ZFS spreads data across the vdevs to increase - performance and maximize usable space. + the case of a RAID transform. When + multiple vdevs are used, ZFS spreads + data across the vdevs to increase performance and + maximize usable space. @@ -901,12 +915,12 @@ vfs.zfs.vdev.cache.size="5M" - File - In addition to - disks, ZFS pools can be backed by regular files, - this is especially useful for testing and - experimentation. Use the full path to the file - as the device path in the zpool create command. - All vdevs must be atleast 128 MB in + File - In addition to disks, + ZFS pools can be backed by + regular files, this is especially useful for + testing and experimentation. Use the full path to + the file as the device path in the zpool create + command. All vdevs must be atleast 128 MB in size. @@ -934,86 +948,93 @@ vfs.zfs.vdev.cache.size="5M" RAID-Z - - ZFS implements RAID-Z, a variation on standard - RAID-5 that offers better distribution of parity - and eliminates the "RAID-5 write hole" in which + ZFS implements + RAID-Z, a variation on standard + RAID-5 that offers better + distribution of parity and eliminates the + "RAID-5 write hole" in which the data and parity information become - inconsistent after an unexpected restart. ZFS - supports 3 levels of RAID-Z which provide - varying levels of redundancy in exchange for - decreasing levels of usable storage. The types - are named RAID-Z1 through RAID-Z3 based on the number - of parity devinces in the array and the number - of disks that the pool can operate - without. - - In a RAID-Z1 configuration with 4 disks, - each 1 TB, usable storage will be 3 TB - and the pool will still be able to operate in - degraded mode with one faulted disk. If an - additional disk goes offline before the faulted - disk is replaced and resilvered, all data in the - pool can be lost. - - In a RAID-Z3 configuration with 8 disks of - 1 TB, the volume would provide 5 TB of - usable space and still be able to operate with - three faulted disks. Sun recommends no more - than 9 disks in a single vdev. 
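The two example layouts above could be created along these lines (a hedged sketch; pool and disk names are placeholders and each disk is assumed to be 1 TB):

   # zpool create mypool raidz1 ada1 ada2 ada3 ada4                # 4 disks, single parity
   # zpool create bigpool raidz3 da0 da1 da2 da3 da4 da5 da6 da7   # 8 disks, triple parity

zfs list then shows roughly 3 TB of usable space for the first pool and roughly 5 TB for the second.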
If the - configuration has more disks, it is recommended - to divide them into separate vdevs and the pool - data will be striped across them. - - A configuration of 2 RAID-Z2 vdevs - consisting of 8 disks each would create - something similar to a RAID-60 array. A RAID-Z - group's storage capacity is approximately the - size of the smallest disk, multiplied by the - number of non-parity disks. Four 1 TB disks - in RAID-Z1 has an effective size of approximately - 3 TB, and an array of eight 1 TB disks in RAID-Z3 will - yield 5 TB of usable space. + inconsistent after an unexpected restart. + ZFS supports 3 levels of + RAID-Z which provide varying + levels of redundancy in exchange for decreasing + levels of usable storage. The types are named + RAID-Z1 through + RAID-Z3 based on the number of + parity devinces in the array and the number of + disks that the pool can operate without. + + In a RAID-Z1 configuration + with 4 disks, each 1 TB, usable storage will + be 3 TB and the pool will still be able to + operate in degraded mode with one faulted disk. + If an additional disk goes offline before the + faulted disk is replaced and resilvered, all data + in the pool can be lost. + + In a RAID-Z3 configuration + with 8 disks of 1 TB, the volume would + provide 5 TB of usable space and still be + able to operate with three faulted disks. Sun + recommends no more than 9 disks in a single vdev. + If the configuration has more disks, it is + recommended to divide them into separate vdevs and + the pool data will be striped across them. + + A configuration of 2 + RAID-Z2 vdevs consisting of 8 + disks each would create something similar to a + RAID-60 array. A + RAID-Z group's storage capacity + is approximately the size of the smallest disk, + multiplied by the number of non-parity disks. + Four 1 TB disks in RAID-Z1 + has an effective size of approximately 3 TB, + and an array of eight 1 TB disks in + RAID-Z3 will yield 5 TB of + usable space. - Spare - ZFS has a special - pseudo-vdev type for keeping track of available - hot spares. Note that installed hot spares are - not deployed automatically; they must manually - be configured to replace the failed device using + Spare - + ZFS has a special pseudo-vdev + type for keeping track of available hot spares. + Note that installed hot spares are not deployed + automatically; they must manually be configured to + replace the failed device using zfs replace. - Log - ZFS Log Devices, also - known as ZFS Intent Log (ZIL) - move the intent log from the regular pool - devices to a dedicated device. The ZIL - accelerates synchronous transactions by using - storage devices (such as - SSDs) that are faster - than those used for the main pool. When - data is being written and the application - requests a guarantee that the data has been - safely stored, the data is written to the faster - ZIL storage, then later flushed out to the - regular disks, greatly reducing the latency of - synchronous writes. Log devices can be - mirrored, but RAID-Z is not supported. If - multiple log devices are used, writes will be - load balanced across them. + Log - ZFS + Log Devices, also known as ZFS Intent Log + (ZIL) move the intent log from + the regular pool devices to a dedicated device. + The ZIL accelerates synchronous + transactions by using storage devices (such as + SSDs) that are faster than + those used for the main pool. 
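A sketch of attaching these special-purpose vdevs to an existing pool (mypool, ada5, and ada6 are placeholders):

   # zpool add mypool spare ada6   # ada6 is a placeholder device
   # zpool add mypool log ada5     # ada5 is assumed to be a fast SSD

If a member disk later fails, the spare is swapped in with zpool replace, for example zpool replace mypool ada2 ada6.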
When data is being + written and the application requests a guarantee + that the data has been safely stored, the data is + written to the faster ZIL + storage, then later flushed out to the regular + disks, greatly reducing the latency of synchronous + writes. Log devices can be mirrored, but + RAID-Z is not supported. If + multiple log devices are used, writes will be load + balanced across them. Cache - Adding a cache vdev to a zpool will add the storage of the cache to - the L2ARC. Cache devices cannot be mirrored. - Since a cache device only stores additional - copies of existing data, there is no risk of - data loss. + the L2ARC. Cache devices + cannot be mirrored. Since a cache device only + stores additional copies of existing data, there + is no risk of data loss. @@ -1022,51 +1043,53 @@ vfs.zfs.vdev.cache.size="5M"Adaptive Replacement Cache (ARC) - ZFS uses an Adaptive Replacement Cache - (ARC), rather than a more - traditional Least Recently Used - (LRU) cache. An - LRU cache is a simple list of items - in the cache sorted by when each object was most - recently used; new items are added to the top of the - list and once the cache is full items from the bottom - of the list are evicted to make room for more active - objects. An ARC consists of four - lists; the Most Recently Used (MRU) - and Most Frequently Used (MFU) - objects, plus a ghost list for each. These ghost - lists track recently evicted objects to prevent them - from being added back to the cache. This increases the - cache hit ratio by avoiding objects that have a - history of only being used occasionally. Another - advantage of using both an MRU and - MFU is that scanning an entire - filesystem would normally evict all data from an - MRU or LRU cache - in favor of this freshly accessed content. In the - case of ZFS, since there is also an + ZFS uses an Adaptive Replacement + Cache (ARC), rather than a more + traditional Least Recently Used (LRU) + cache. An LRU cache is a simple list + of items in the cache sorted by when each object was + most recently used; new items are added to the top of + the list and once the cache is full items from the + bottom of the list are evicted to make room for more + active objects. An ARC consists of + four lists; the Most Recently Used + (MRU) and Most Frequently Used + (MFU) objects, plus a ghost list for + each. These ghost lists track recently evicted objects + to prevent them from being added back to the cache. + This increases the cache hit ratio by avoiding objects + that have a history of only being used occasionally. + Another advantage of using both an + MRU and MFU is + that scanning an entire filesystem would normally evict + all data from an MRU or + LRU cache in favor of this freshly + accessed content. In the case of + ZFS, since there is also an MFU that only tracks the most - frequently used objects, the cache of the most - commonly accessed blocks remains. + frequently used objects, the cache of the most commonly + accessed blocks remains. - L2ARC + L2ARC The L2ARC is the second level of the ZFS caching system. The primary ARC is stored in RAM, however since the amount of available RAM is often limited, - ZFS can also make use of cache + ZFS can also make use of + cache vdevs. Solid State Disks (SSDs) are often used as these cache devices due to their higher speed and lower latency compared to traditional spinning - disks. 
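Attaching one of these SSDs as an L2ARC device is a single command; a sketch with placeholder names:

   # zpool add mypool cache ada7   # hypothetical SSD device
   # zpool iostat -v mypool

zpool iostat -v then lists the cache device alongside the pool's other vdevs and shows how much of it is currently in use.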
An L2ARC is entirely optional, but having one - will significantly increase read speeds for files that - are cached on the SSD instead of - having to be read from the regular spinning disks. The + disks. An L2ARC is entirely + optional, but having one will significantly increase + read speeds for files that are cached on the + SSD instead of having to be read from + the regular spinning disks. The L2ARC can also speed up deduplication since a DDT that does not fit in @@ -1092,48 +1115,51 @@ vfs.zfs.vdev.cache.size="5M"Copy-On-Write Unlike a traditional file system, when data is - overwritten on ZFS the new data is written to a - different block rather than overwriting the old data in - place. Only once this write is complete is the metadata - then updated to point to the new location of the data. - This means that in the event of a shorn write (a system - crash or power loss in the middle of writing a file), the - entire original contents of the file are still available - and the incomplete write is discarded. This also means - that ZFS does not require a &man.fsck.8; after an unexpected + overwritten on ZFS the new data is + written to a different block rather than overwriting the + old data in place. Only once this write is complete is + the metadata then updated to point to the new location + of the data. This means that in the event of a shorn + write (a system crash or power loss in the middle of + writing a file), the entire original contents of the + file are still available and the incomplete write is + discarded. This also means that ZFS + does not require a &man.fsck.8; after an unexpected shutdown. Dataset - Dataset is the generic term for a ZFS file system, - volume, snapshot or clone. Each dataset will have a - unique name in the format: - poolname/path@snapshot. The root of - the pool is technically a dataset as well. Child - datasets are named hierarchically like directories; for - example, mypool/home, the home dataset, - is a child of mypool and inherits properties from it. - This can be expanded further by creating - mypool/home/user. This grandchild - dataset will inherity properties from the parent and - grandparent. It is also possible to set properties - on a child to override the defaults inherited from the - parents and grandparents. ZFS also allows - administration of datasets and their children to be - delegated. + Dataset is the generic term for a + ZFS file system, volume, snapshot or + clone. Each dataset will have a unique name in the + format: poolname/path@snapshot. The + root of the pool is technically a dataset as well. + Child datasets are named hierarchically like + directories; for example, + mypool/home, the home dataset, is a + child of mypool and inherits + properties from it. This can be expanded further by + creating mypool/home/user. This + grandchild dataset will inherity properties from the + parent and grandparent. It is also possible to set + properties on a child to override the defaults inherited + from the parents and grandparents. + ZFS also allows administration of + datasets and their children to be delegated. Volume - In additional to regular file system datasets, ZFS - can also create volumes, which are block devices. - Volumes have many of the same features, including - copy-on-write, snapshots, clones and checksumming. - Volumes can be useful for running other file system - formats on top of ZFS, such as UFS or in the case of + In additional to regular file system datasets, + ZFS can also create volumes, which + are block devices. 
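A brief sketch of creating a volume and layering UFS on top of it (the 4G size and all names are illustrative):

   # zfs create -V 4G mypool/ufsvol   # illustrative size and names
   # newfs /dev/zvol/mypool/ufsvol
   # mount /dev/zvol/mypool/ufsvol /mnt

The volume shows up as an ordinary block device under /dev/zvol/, so it can be formatted, exported as an iSCSI extent, or handed to a virtual machine.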
Volumes have many of the same + features, including copy-on-write, snapshots, clones and + checksumming. Volumes can be useful for running other + file system formats on top of ZFS, + such as UFS or in the case of Virtualization or exporting iSCSI extents. @@ -1142,41 +1168,40 @@ vfs.zfs.vdev.cache.size="5M"Snapshot The copy-on-write - - design of ZFS allows for nearly instantaneous consistent - snapshots with arbitrary names. After taking a snapshot - of a dataset (or a recursive snapshot of a parent - dataset that will include all child datasets), new data - is written to new blocks (as described above), however - the old blocks are not reclaimed as free space. There - are then two versions of the file system, the snapshot - (what the file system looked like before) and the live - file system; however no additional space is used. As - new data is written to the live file system, new blocks - are allocated to store this data. The apparent size of - the snapshot will grow as the blocks are no longer used - in the live file system, but only in the snapshot. - These snapshots can be mounted (read only) to allow for - the recovery of previous versions of files. It is also - possible to rollback - a live file system to a specific snapshot, undoing any - changes that took place after the snapshot was taken. - Each block in the zpool has a reference counter which + linkend="zfs-term-cow">copy-on-write design of + ZFS allows for nearly instantaneous + consistent snapshots with arbitrary names. After taking + a snapshot of a dataset (or a recursive snapshot of a + parent dataset that will include all child datasets), + new data is written to new blocks (as described above), + however the old blocks are not reclaimed as free space. + There are then two versions of the file system, the + snapshot (what the file system looked like before) and + the live file system; however no additional space is + used. As new data is written to the live file system, + new blocks are allocated to store this data. The + apparent size of the snapshot will grow as the blocks + are no longer used in the live file system, but only in + the snapshot. These snapshots can be mounted (read + only) to allow for the recovery of previous versions of + files. It is also possible to + rollback a live + file system to a specific snapshot, undoing any changes + that took place after the snapshot was taken. Each + block in the zpool has a reference counter which indicates how many snapshots, clones, datasets or volumes make use of that block. As files and snapshots are deleted, the reference count is decremented; once a block is no longer referenced, it is reclaimed as free - space. Snapshots can also be marked with a hold, - once a snapshot is held, any attempt to destroy it will - return an EBUY error. Each snapshot can have multiple - holds, each with a unique name. The release - command removes the hold so the snapshot can then be - deleted. Snapshots can be taken on volumes, however - they can only be cloned or rolled back, not mounted + space. Snapshots can also be marked with a + hold, once a + snapshot is held, any attempt to destroy it will return + an EBUY error. Each snapshot can have multiple holds, + each with a unique name. The + release command + removes the hold so the snapshot can then be deleted. + Snapshots can be taken on volumes, however they can only + be cloned or rolled back, not mounted independently. 
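The snapshot lifecycle described above maps to a handful of commands; a sketch using a hypothetical dataset mypool/home:

   # zfs snapshot mypool/home@before-upgrade   # dataset and snapshot names are placeholders
   # zfs rollback mypool/home@before-upgrade
   # zfs hold keepme mypool/home@before-upgrade
   # zfs destroy mypool/home@before-upgrade    # fails with EBUSY while the hold exists
   # zfs release keepme mypool/home@before-upgrade

Once the keepme hold has been released, the zfs destroy of the snapshot succeeds.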
@@ -1206,13 +1231,16 @@ vfs.zfs.vdev.cache.size="5M"Every block that is allocated is also checksummed (the algorithm used is a per dataset property, see: - zfs set). ZFS transparently validates the checksum of - each block as it is read, allowing ZFS to detect silent - corruption. If the data that is read does not match the - expected checksum, ZFS will attempt to recover the data - from any available redundancy, like mirrors or RAID-Z). Validation of all checksums can be triggered with - the - scrub + zfs set). ZFS + transparently validates the checksum of each block as it + is read, allowing ZFS to detect + silent corruption. If the data that is read does not + match the expected checksum, ZFS will + attempt to recover the data from any available + redundancy, like mirrors or RAID-Z). + Validation of all checksums can be triggered with the + scrub command. Available checksum algorithms include: @@ -1238,90 +1266,96 @@ vfs.zfs.vdev.cache.size="5M" Compression - Each dataset in ZFS has a compression property, - which defaults to off. This property can be set to one - of a number of compression algorithms, which will cause - all new data that is written to this dataset to be - compressed as it is written. In addition to the - reduction in disk usage, this can also increase read and - write throughput, as only the smaller compressed version - of the file needs to be read or written. + Each dataset in ZFS has a + compression property, which defaults to off. This + property can be set to one of a number of compression + algorithms, which will cause all new data that is + written to this dataset to be compressed as it is + written. In addition to the reduction in disk usage, + this can also increase read and write throughput, as + only the smaller compressed version of the file needs to + be read or written. - LZ4 compression is only available after &os; - 9.2 + LZ4 compression is only + available after &os; 9.2 Deduplication - ZFS has the ability to detect duplicate blocks of - data as they are written (thanks to the checksumming - feature). If deduplication is enabled, instead of - writing the block a second time, the reference count of - the existing block will be increased, saving storage - space. To do this, ZFS keeps a deduplication - table (DDT) in memory, containing the - list of unique checksums, the location of that block and - a reference count. When new data is written, the - checksum is calculated and compared to the list. If a - match is found, the data is considered to be a - duplicate. When deduplication is enabled, the checksum - algorithm is changed to SHA256 to - provide a secure cryptographic hash. ZFS deduplication - is tunable; if dedup is on, then a matching checksum is - assumed to mean that the data is identical. If dedup is - set to verify, then the data in the two blocks will be - checked byte-for-byte to ensure it is actually identical - and if it is not, the hash collision will be noted by - ZFS and the two blocks will be stored separately. Due - to the nature of the DDT, having to - store the hash of each unique block, it consumes a very - large amount of memory (a general rule of thumb is - 5-6 GB of ram per 1 TB of deduplicated data). - In situations where it is not practical to have enough - RAM to keep the entire DDT in memory, - performance will suffer greatly as the DDT will need to - be read from disk before each new block is written. 
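These per-dataset behaviours (checksum, compression, deduplication) are all controlled with zfs set; a sketch using a hypothetical dataset mypool/data:

   # zfs set checksum=sha256 mypool/data    # mypool/data is a placeholder dataset
   # zfs set compression=lzjb mypool/data
   # zfs set dedup=verify mypool/data
   # zfs get compressratio mypool/data

dedup=verify requests the byte-for-byte comparison described above, and the read-only compressratio property reports how effective compression has been on the data written so far.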
- Deduplication can make use of the L2ARC to store the - DDT, providing a middle ground between fast system - memory and slower disks. Consider - using ZFS compression instead, which often provides - nearly as much space savings without the additional - memory requirement. + ZFS has the ability to detect + duplicate blocks of data as they are written (thanks to + the checksumming feature). If deduplication is enabled, + instead of writing the block a second time, the + reference count of the existing block will be increased, + saving storage space. To do this, + ZFS keeps a deduplication table + (DDT) in memory, containing the list + of unique checksums, the location of that block and a + reference count. When new data is written, the checksum + is calculated and compared to the list. If a match is + found, the data is considered to be a duplicate. When + deduplication is enabled, the checksum algorithm is + changed to SHA256 to provide a secure + cryptographic hash. ZFS + deduplication is tunable; if dedup is on, then a + matching checksum is assumed to mean that the data is + identical. If dedup is set to verify, then the data in + the two blocks will be checked byte-for-byte to ensure + it is actually identical and if it is not, the hash + collision will be noted by ZFS and + the two blocks will be stored separately. Due to the + nature of the DDT, having to store + the hash of each unique block, it consumes a very large + amount of memory (a general rule of thumb is 5-6 GB + of ram per 1 TB of deduplicated data). In + situations where it is not practical to have enough + RAM to keep the entire + DDT in memory, performance will + suffer greatly as the DDT will need + to be read from disk before each new block is written. + Deduplication can make use of the + L2ARC to store the + DDT, providing a middle ground + between fast system memory and slower disks. Consider + using ZFS compression instead, which + often provides nearly as much space savings without the + additional memory requirement. Scrub - In place of a consistency check like &man.fsck.8;, ZFS has - the scrub command, which reads all - data blocks stored on the pool and verifies their - checksums them against the known good checksums stored - in the metadata. This periodic check of all the data - stored on the pool ensures the recovery of any corrupted - blocks before they are needed. A scrub is not required - after an unclean shutdown, but it is recommended that - you run a scrub at least once each quarter. ZFS - compares the checksum for each block as it is read in - the normal course of use, but a scrub operation makes - sure even infrequently used blocks are checked for - silent corruption. + In place of a consistency check like &man.fsck.8;, + ZFS has the scrub + command, which reads all data blocks stored on the pool + and verifies their checksums them against the known good + checksums stored in the metadata. This periodic check + of all the data stored on the pool ensures the recovery + of any corrupted blocks before they are needed. A scrub + is not required after an unclean shutdown, but it is + recommended that you run a scrub at least once each + quarter. ZFS compares the checksum + for each block as it is read in the normal course of + use, but a scrub operation makes sure even infrequently + used blocks are checked for silent corruption. Dataset Quota - ZFS provides very fast and accurate dataset, user - and group space accounting in addition to quotas and - space reservations. 
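As a sketch of the corresponding commands for scrubbing and for the quota and reservation properties (pool and dataset names are placeholders):

   # zpool scrub mypool                        # placeholder pool and dataset names
   # zpool status mypool
   # zfs set quota=10G storage/home/bob
   # zfs set reservation=5G storage/home/bob

zpool status reports the progress of a running scrub and any errors it repaired, while the quota and reservation properties implement the per-dataset limits and guaranteed space that this accounting makes possible.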
This gives the administrator fine - grained control over how space is allocated and allows - critical file systems to reserve space to ensure other - file systems do not take all of the free space. + ZFS provides very fast and + accurate dataset, user and group space accounting in + addition to quotas and space reservations. This gives + the administrator fine grained control over how space is + allocated and allows critical file systems to reserve + space to ensure other file systems do not take all of + the free space. - ZFS supports different types of quotas: the - dataset quota, the ZFS supports different types of + quotas: the dataset quota, the reference quota (refquota), the user @@ -1381,9 +1415,9 @@ vfs.zfs.vdev.cache.size="5M"storage/home/bob, the space used by - that snapshot is counted against the reservation. The - storage/home/bob, + the space used by that snapshot is counted against the + reservation. The refreservation property works in a similar way, except it excludes descendants, such as