From owner-svn-doc-projects@FreeBSD.ORG Fri Feb 21 15:23:53 2014 Return-Path: Delivered-To: svn-doc-projects@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CFE4AB7E; Fri, 21 Feb 2014 15:23:53 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id B8AF11A1C; Fri, 21 Feb 2014 15:23:53 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.8/8.14.8) with ESMTP id s1LFNrZP018865; Fri, 21 Feb 2014 15:23:53 GMT (envelope-from wblock@svn.freebsd.org) Received: (from wblock@localhost) by svn.freebsd.org (8.14.8/8.14.8/Submit) id s1LFNrmZ018864; Fri, 21 Feb 2014 15:23:53 GMT (envelope-from wblock@svn.freebsd.org) Message-Id: <201402211523.s1LFNrmZ018864@svn.freebsd.org> From: Warren Block Date: Fri, 21 Feb 2014 15:23:53 +0000 (UTC) To: doc-committers@freebsd.org, svn-doc-projects@freebsd.org Subject: svn commit: r44014 - projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs X-SVN-Group: doc-projects MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-doc-projects@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for doc projects trees List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Feb 2014 15:23:53 -0000 Author: wblock Date: Fri Feb 21 15:23:53 2014 New Revision: 44014 URL: http://svnweb.freebsd.org/changeset/doc/44014 Log: Edits and new content from Allan Jude . Modified: projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Modified: projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml ============================================================================== --- projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Fri Feb 21 13:26:41 2014 (r44013) +++ projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml Fri Feb 21 15:23:53 2014 (r44014) @@ -86,8 +86,8 @@ - A complete list of ZFS features and - terminology is shown in . + A complete list of features and terminology is shown in + . What Makes <acronym>ZFS</acronym> Different @@ -123,7 +123,7 @@ - <acronym>ZFS</acronym> Quick Start Guide + Quick Start Guide There is a start up mechanism that allows &os; to mount ZFS pools during system initialization. To @@ -148,9 +148,8 @@ Single Disk Pool - To create a simple, non-redundant ZFS - pool using a single disk device, use - zpool: + To create a simple, non-redundant pool using a single + disk device, use zpool create: &prompt.root; zpool create example /dev/da0 @@ -270,7 +269,7 @@ example/data 17547008 0 175 - <acronym>ZFS</acronym> RAID-Z + RAID-Z Disks fail. One method of avoiding data loss from disk failure is to @@ -481,11 +480,11 @@ errors: No known data errors &prompt.root; zpool scrub storage The duration of a scrub depends on the amount of data - stored. Large amounts of data can take a considerable amount - of time to verify. It is also very I/O - intensive, so much so that only one scrub> may be run at any - given time. After the scrub has completed, the status is - updated and may be viewed with a status request: + stored. 
Larger amounts of data will take considerably longer + to verify. Scrubs are very I/O intensive, + so much so that only one scrub may be run at a time. After + the scrub has completed, the status is updated and may be + viewed with status: &prompt.root; zpool status storage pool: storage @@ -502,9 +501,10 @@ config: errors: No known data errors - The completion time is displayed and helps to ensure data - integrity over a long period of time. - + The completion date of the last scrub operation is + displayed to help track when another scrub is required. + Routine pool scrubs help protect data from silent corruption + and ensure the integrity of the pool. Refer to &man.zfs.8; and &man.zpool.8; for other ZFS options. @@ -581,6 +581,51 @@ errors: No known data errors redundancy. + + Checking the Status of a Pool + + Pool status is important. If a drive goes offline or a + read, write, or checksum error is detected, the error + counter in status is incremented. The + status output shows the configuration and + status of each device in the pool, in addition to the status + of the entire pool. Actions that need to be taken and details + about the last scrub + are also shown. + + &prompt.root; zpool status + pool: mypool + state: ONLINE + scan: scrub repaired 0 in 2h25m with 0 errors on Sat Sep 14 04:25:50 2013 +config: + + NAME STATE READ WRITE CKSUM + mypool ONLINE 0 0 0 + raidz2-0 ONLINE 0 0 0 + ada0p3 ONLINE 0 0 0 + ada1p3 ONLINE 0 0 0 + ada2p3 ONLINE 0 0 0 + ada3p3 ONLINE 0 0 0 + ada4p3 ONLINE 0 0 0 + ada5p3 ONLINE 0 0 0 + +errors: No known data errors + + + + Clearing Errors + + When an error is detected, the read, write, or checksum + counts are incremented. The error message can be cleared and + the counts reset with zpool clear + mypool. Clearing the + error state can be important for automated scripts that alert + the administrator when the pool encounters an error. Further + errors may not be reported if the old errors are not + cleared. + + Replacing a Functioning Device @@ -622,28 +667,60 @@ errors: No known data errors restored from backups. + + Scrubbing a Pool + + Pools should be + Scrubbed regularly, + ideally at least once every three months. The + scrub operating is very disk-intensive and + will reduce performance while running. Avoid high-demand + periods when scheduling scrub. + + &prompt.root; zpool scrub mypool +&prompt.root; zpool status + pool: mypool + state: ONLINE + scan: scrub in progress since Wed Feb 19 20:52:54 2014 + 116G scanned out of 8.60T at 649M/s, 3h48m to go + 0 repaired, 1.32% done +config: + + NAME STATE READ WRITE CKSUM + mypool ONLINE 0 0 0 + raidz2-0 ONLINE 0 0 0 + ada0p3 ONLINE 0 0 0 + ada1p3 ONLINE 0 0 0 + ada2p3 ONLINE 0 0 0 + ada3p3 ONLINE 0 0 0 + ada4p3 ONLINE 0 0 0 + ada5p3 ONLINE 0 0 0 + +errors: No known data errors + + - ZFS Self-Healing + Self-Healing - ZFS utilizes the checkums stored with - each data block to provide a feature called self-healing. - This feature will automatically repair data whose checksum - does not match the one recorded on another device that is part - of the storage pool. For example, a mirror with two disks - where one drive is starting to malfunction and cannot properly - store the data any more. This is even worse when the data has - not been accessed for a long time in long term archive storage - for example. Traditional file systems need to run algorithms - that check and repair the data like the &man.fsck.8; program. 
- These commands take time and in severe cases, an administrator - has to manually decide which repair operation has to be - performed. When ZFS detects that a data - block is being read whose checksum does not match, it will try - to read the data from the mirror disk. If that disk can - provide the correct data, it will not only give that data to - the application requesting it, but also correct the wrong data - on the disk that had the bad checksum. This happens without - any interaction of a system administrator during normal pool + The checksums stored with data blocks enable the file + system to self-heal. This feature will + automatically repair data whose checksum does not match the + one recorded on another device that is part of the storage + pool. For example, a mirror with two disks where one drive is + starting to malfunction and cannot properly store the data any + more. This is even worse when the data has not been accessed + for a long time in long term archive storage for example. + Traditional file systems need to run algorithms that check and + repair the data like the &man.fsck.8; program. These commands + take time and in severe cases, an administrator has to + manually decide which repair operation has to be performed. + When ZFS detects that a data block is being + read whose checksum does not match, it will try to read the + data from the mirror disk. If that disk can provide the + correct data, it will not only give that data to the + application requesting it, but also correct the wrong data on + the disk that had the bad checksum. This happens without any + interaction of a system administrator during normal pool operation. The following example will demonstrate this self-healing @@ -890,17 +967,38 @@ errors: No known data errors need to be imported on an older system before upgrading. The upgrade process is unreversible and cannot be undone. + &prompt.root; zpool status + pool: mypool + state: ONLINE +status: The pool is formatted using a legacy on-disk format. The pool can + still be used, but some features are unavailable. +action: Upgrade the pool using 'zpool upgrade'. Once this is done, the + pool will no longer be accessible on software that does not support feat + flags. + scan: none requested +config: + + NAME STATE READ WRITE CKSUM + mypool ONLINE 0 0 0 + mirror-0 ONLINE 0 0 0 + ada0 ONLINE 0 0 0 + ada1 ONLINE 0 0 0 + +errors: No known data errors + The newer features of ZFS will not be available until zpool upgrade has completed. can be used to see what new features will be provided by upgrading, as well as which - features are already supported by the existing version. - + features are already supported. - - Checking the Status of a Pool - - + + Systems that boot from a pool must have their boot code + updated to support the new pool version. Run + gpart bootcode on the partition that + contains the boot code. See &man.gpart.8; for more + information. + @@ -1050,11 +1148,11 @@ data 288G 1.53T The zfs utility is responsible for creating, destroying, and managing all ZFS datasets that exist within a pool. The pool is managed using - the zpool - command. + zpool. - Creating & Destroying Datasets + Creating and Destroying Datasets Unlike traditional disks and volume managers, space in ZFS is not preallocated. With @@ -1255,29 +1353,30 @@ tank custom:costcenter - - ZFS Replication + Replication Keeping data on a single pool in one location exposes - it to risks like theft, natural and human disasters. 
Keeping + it to risks like theft, natural, and human disasters. Making regular backups of the entire pool is vital when data needs to be restored. ZFS provides a built-in serialization feature that can send a stream representation of the data to standard output. Using this technique, it is possible to not only store the data on another pool connected to the local system, but also to send it over a network to - another system that runs ZFS. To achieve this replication, - ZFS uses filesystem snapshots (see the - section on ZFS snapshots) to send - them from one location to another. The commands for this - operation are zfs send and + another system that runs ZFS . To achieve + this replication, ZFS uses filesystem + snapshots (see the section on + ZFS + snapshots) to send them from one location to another. + The commands for this operation are + zfs send and zfs receive, respectively. The following examples will demonstrate the functionality of ZFS replication using these two pools: - &prompt.root; zpool list + &prompt.root; zpool list NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT backup 960M 77K 896M 0% 1.00x ONLINE - mypool 984M 43.7M 940M 4% 1.00x ONLINE - @@ -1297,31 +1396,31 @@ mypool 984M 43.7M 940M 4% 1.00x ZFS only replicates snapshots, changes since the most recent snapshot will not be replicated. - &prompt.root; zfs snapshot mypool@backup1 -&prompt.root; zfs list -t snapshot + &prompt.root; zfs snapshot mypool@backup1 +&prompt.root; zfs list -t snapshot NAME USED AVAIL REFER MOUNTPOINT mypool@backup1 0 - 43.6M - Now that a snapshot exists, zfs send can be used to create a stream representing the contents of the snapshot, which can be stored as a file, or received by - another pool. The stream will be written to standard - output, which will need to be redirected to a file or pipe - otherwise ZFS will produce an error: + another pool. The stream will be written to standard output, + which will need to be redirected to a file or pipe otherwise + ZFS will produce an error: - &prompt.root; zfs send mypool@backup1 + &prompt.root; zfs send mypool@backup1 Error: Stream can not be written to a terminal. You must redirect standard output. To backup a dataset with zfs send, redirect to a file located on the mounted backup pool. First ensure that the pool has enough free space to accommodate the - size of the snapshot you are sending, which means all of the + size of the snapshot being sendt, which means all of the data contained in the snapshot, not only the changes in that snapshot. - &prompt.root; zfs send mypool@backup1 > /backup/backup1 -&prompt.root; zpool list + &prompt.root; zfs send mypool@backup1 > /backup/backup1 +&prompt.root; zpool list NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT backup 960M 63.7M 896M 6% 1.00x ONLINE - mypool 984M 43.7M 940M 4% 1.00x ONLINE - @@ -1334,10 +1433,10 @@ mypool 984M 43.7M 940M 4% 1.00x Instead of storing the backups as archive files, ZFS can receive them as a live file system, - allowing the backed up data to be accessed directly. - To get to the actual data contained in those streams, the - reverse operation of zfs send must be used - to transform the streams back into files and directories. The + allowing the backed up data to be accessed directly. To get + to the actual data contained in those streams, the reverse + operation of zfs send must be used to + transform the streams back into files and directories. The command is zfs receive. 
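A stream that was saved to a file, like /backup/backup1 in the example above, can be turned back into a file system by redirecting the file into zfs receive. As a minimal sketch, assuming the hypothetical target dataset backup/restored does not already exist:

&prompt.root; zfs receive backup/restored < /backup/backup1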
The example below combines zfs send and zfs receive using a pipe to copy the data + from one pool to another. The data can be used directly on + the receiving pool after the transfer is complete. A dataset + can only be replicated to an empty dataset.

- &prompt.root; zfs snapshot mypool@replica1 -&prompt.root; zfs send -v mypool@replica1 | zfs receive backup/mypool + &prompt.root; zfs snapshot mypool@replica1 +&prompt.root; zfs send -v mypool@replica1 | zfs receive backup/mypool send from @ to mypool@replica1 estimated size is 50.1M total estimated size is 50.1M TIME SENT SNAPSHOT -&prompt.root; zpool list +&prompt.root; zpool list NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT backup 960M 63.7M 896M 6% 1.00x ONLINE - mypool 984M 43.7M 940M 4% 1.00x ONLINE -

- ZFS Incremental Backups + Incremental Backups

- Another feature of zfs send is that - it can determine the difference between two snapshots to - only send what has changed between the two. This results in - saving disk space and time for the transfer to another pool. - For example: + zfs send can also determine the + difference between two snapshots and only send the changes + between the two. This results in saving disk space and + transfer time. For example:

- &prompt.root; zfs snapshot mypool@backup2 + &prompt.root; zfs snapshot mypool@replica2 &prompt.root; zfs list -t snapshot NAME USED AVAIL REFER MOUNTPOINT -mypool@backup1 5.72M - 43.6M - -mypool@backup2 0 - 44.1M - +mypool@replica1 5.72M - 43.6M - +mypool@replica2 0 - 44.1M - &prompt.root; zpool list NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT backup 960M 61.7M 898M 6% 1.00x ONLINE - mypool 960M 50.2M 910M 5% 1.00x ONLINE -

A second snapshot called - backup2 was created. This second - snapshot contains only the changes on the ZFS filesystem - between now and the last snapshot, - backup1. Using the - -i flag to zfs send - and providing both snapshots, an incremental snapshot can be - transferred, containing only the data that has - changed. + replica2 was created. This + second snapshot contains only the changes on the + ZFS filesystem between now and the + previous snapshot, replica1. + Using -i with zfs send + and indicating the pair of snapshots, an incremental replica + stream can be generated, containing only the data that has + changed. This can only succeed if the initial snapshot + already exists on the receiving side.

+ &prompt.root; zfs send -v -i mypool@replica1 mypool@replica2 | zfs receive backup/mypool +send from @replica1 to mypool@replica2 estimated size is 5.02M +total estimated size is 5.02M +TIME SENT SNAPSHOT

- &prompt.root; zfs send -i mypool@backup1 mypool@backup2 > /backup/incremental &prompt.root; zpool list NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT backup 960M 80.8M 879M 8% 1.00x ONLINE - mypool 960M 50.2M 910M 5% 1.00x ONLINE - -&prompt.root; ls -lh /backup -total 82247 -drwxr-xr-x 1 root wheel 61M Dec 3 11:36 backup1 -drwxr-xr-x 1 root wheel 18M Dec 3 11:36 incremental - The incremental stream was successfully transferred and - the file on disk is smaller than any of the two snapshots - backup1 or - backup2. This shows that it only - contains the differences, which is much faster to transfer - and saves disk space by not copying the complete pool each - time. This is useful when having to rely on slow networks - or when costs per transferred byte have to be - considered.

- - - - Receiving ZFS Data Streams - Up until now, only the data streams in binary form were - sent to other pools. To get to the actual data contained in - those streams, the reverse operation of zfs - send has to be used to transform the streams - back into files and directories. The command is called - zfs receive and has also a short version: - zfs recv. The example below combines - zfs send and zfs - receive using a pipe to copy the data from one - pool to another. This way, the data can be used directly on - the receiving pool after the transfer is complete.

- &prompt.root; zfs send mypool@backup1 | zfs receive backup/backup1 -&prompt.root; ls -lh /backup -total 431 -drwxr-xr-x 4219 root wheel 4.1k Dec 3 11:34 backup1

- The directory backup1 does - contain all the data, which were part of the snapshot of the - same name. Since this originally was a complete filesystem - snapshot, the listing of all ZFS filesystems for this pool - is also updated and shows the - backup1 entry.

+&prompt.root; zfs list +NAME USED AVAIL REFER MOUNTPOINT +backup 55.4M 240G 152K /backup +backup/mypool 55.3M 240G 55.2M /backup/mypool +mypool 55.6M 11.6G 55.0M /mypool

- &prompt.root; zfs list -NAME USED AVAIL REFER MOUNTPOINT -backup 43.7M 884M 32K /backup -backup/backup1 43.5M 884M 43.5M /backup/backup1 -mypool 50.0M 878M 44.1M /mypool

+&prompt.root; zfs list -t snapshot +NAME USED AVAIL REFER MOUNTPOINT +backup/mypool@replica1 104K - 50.2M - +backup/mypool@replica2 0 - 55.2M - +mypool@replica1 29.9K - 50.0M - +mypool@replica2 0 - 55.0M -

- A new filesystem, backup1 is - available and has the same size as the snapshot it was - created from. It is up to the user to decide whether the - streams should be transformed back into filesystems directly - to have a cold-standby for emergencies or to just keep the - streams and transform them later when required. Sending and - receiving can be automated so that regular backups are - created on a second pool for backup purposes.

+ The incremental stream was successfully transferred and + only the data that has changed was replicated, rather than + the entirety of replica1 and + replica2, which both contain mostly + the same data. The transmitted data only contains the + differences, which took much less time to transfer and saves + disk space by not copying the complete pool each time. This + is useful when having to rely on slow networks or when costs + per transferred byte have to be considered.

+ A new filesystem, + backup/mypool is + available and has all of the files and data from the pool + mypool. If -p + is specified, the properties of the dataset will be copied, + including compression settings, quotas, and mount points. If + -R is specified, all child datasets of the + indicated dataset will be copied, along with all of their + properties. Sending and receiving can be automated so that + regular backups are created on the second pool.

@@ -1454,27 +1534,26 @@ mypool 50.0M 878M 44. Although sending streams to another system over the network is a good way to keep a remote backup, it does come - with a drawback. All the data sent over the network link is - not encrypted, allowing anyone to intercept and transform - the streams back into data without the knowledge of the - sending user. This is an unacceptable situation, especially - when sending the streams over the internet to a remote host - with multiple hops in between where such malicious data - collection can occur. Fortunately, there is a solution - available to the problem that does not require the - encryption of the data on the pool itself. To make sure the - network connection between both systems is securely - encrypted, SSH can be used. - Since ZFS only requires the stream to be redirected from - standard output, it is relatively easy to pipe it through - SSH. + with a drawback. Data sent over the network link is not + encrypted, allowing anyone to intercept and transform the + streams back into data without the knowledge of the sending + user. This is undesirable, especially when sending the + streams over the internet to a remote host. + SSH can be used to securely + encrypt data sent over a network connection. Since + ZFS only requires the stream to be + redirected from standard output, it is relatively easy to + pipe it through SSH. To keep + the contents of the file system encrypted in transit and + on the remote system, consider using PEFS.

A few settings and security precautions have to be made - before this can be done. Since this chapter is about ZFS - and not about configuring SSH, it only lists the things - required to perform the encrypted zfs - send operation. The following settings should - be made: + before this can be done. Since this chapter is about + ZFS and not about configuring SSH, it + only lists the things required to perform the + zfs send operation. The following + configuration is required:

@@ -1483,50 +1562,71 @@ mypool 50.0M 878M 44.

- The root - user needs to be able to log into the receiving system - because only that user can send streams from the pool. - SSH should be configured so - that root can - only execute zfs recv and nothing - else to prevent users that might have hijacked this - account from doing any harm on the system. + Normally, the privileges of the + root user are + required to send and receive ZFS + streams. This requires logging in to the receiving + system as + root. + However, logging in as root is disabled by default for + security reasons. The ZFS Delegation system + can be used to allow a non-root user on each system to + perform the respective send and receive + operations.

+ On the sending system: + &prompt.root; zfs allow -u someuser send,snapshot mypool

+ In order for the pool to be mounted, the unprivileged + user must own the directory, and regular users must be + allowed to mount file systems. On the receiving + system:

+ &prompt.root; sysctl vfs.usermount=1 +vfs.usermount: 0 -> 1 +&prompt.root; echo vfs.usermount=1 >> /etc/sysctl.conf +&prompt.root; zfs create recvpool/backup +&prompt.root; zfs allow -u someuser create,mount,receive recvpool/backup +&prompt.root; chown someuser /recvpool/backup

- After these security measures have been put into place - and root can - connect via passwordless SSH to - the receiving system, the encrypted stream can be sent using - the following commands: + The unprivileged user can now receive and mount the + replicated stream. Then the pool can be replicated:

- &prompt.root; zfs snapshot -r mypool/home@monday -&prompt.root; zfs send -R mypool/home@monday | ssh backuphost zfs recv -dvu backuppool + &prompt.user; zfs snapshot -r mypool/home@monday +&prompt.user; zfs send -R mypool/home@monday | ssh someuser@backuphost zfs recv -dvu recvpool/backup

The first command creates a recursive snapshot (option - -r) called - monday of the filesystem named + -r) called + monday of the filesystem dataset + home that resides on the pool + mypool. The second command uses - the -R option to zfs - send, which makes sure that all datasets and - filesystems along with their children are included in the - transmission of the data stream. This also includes - snaphots, clones and settings on individual filesystems as - well. The output is piped directly to SSH that uses a short - name for the receiving host called - backuphost. A fully qualified - domain name or IP address can also be used here. The SSH - command to execute is zfs recv to a pool - called backuppool. Using the - -d option with zfs - recv will remove the original name of the pool - on the receiving side and just takes the name of the - snapshot instead. The -u option makes - sure that the filesystem is not mounted on the receiving - side. More information about the transfer—like the - time that has passed—is displayed when the - -v option is provided. + zfs send with -R, which + makes sure that the dataset and all child datasets are + included in the transmitted data stream. This also includes + snapshots, clones and settings on individual filesystems as + well. The output is piped to the waiting + zfs receive on the remote host + backuphost via + SSH. A fully qualified domain + name or IP address should be used here. The receiving + machine will write the data to the + backup dataset on the + recvpool pool. Using -d + with zfs recv + will remove the original name of the pool on the receiving + side and just takes the name of the snapshot instead. + -u causes the filesystem(s) to not be + mounted on the receiving side. When -v is + included, more detail about the transfer is shown. + Included are the elapsed time and the amount of data + transferred.

@@ -1676,12 +1776,6 @@ mypool 50.0M 878M 44. &prompt.root; zfs get refreservation storage/home/bob - - Compression - - - - Deduplication

@@ -1778,8 +1872,80 @@ dedup = 1.05, compress = 1.11, copies = due to the much lower memory requirements.

+ + Compression + + ZFS provides transparent compression. + Compressing data at the block level as it is written not only + saves storage space, but can also result in higher disk + throughput than would otherwise be possible. If data is + compressed by 25%, then the compressed data can be written to + the disk at the same rate as the uncompressed version, + resulting in an effective write speed of 125% of what would + normally be possible. Compression can also be a great + alternative to + Deduplication + because it does not require additional memory to store a + DDT.

+ ZFS offers a number of different + compression algorithms to choose from, each with different + trade-offs. With the introduction of LZ4 + compression in ZFS v5000, it is possible + to enable compression for the entire pool without the large + performance trade-off of other algorithms. The biggest + advantage to LZ4 is the + early abort feature. If + LZ4 does not achieve at least 12.5% + compression in the first part of the data, the block is + written uncompressed to avoid wasting CPU cycles trying to + compress data that is either already compressed or + uncompressible. For details about the different compression + algorithms available in ZFS, see the + Compression entry + in the terminology section.

+ The administrator can monitor the effectiveness of + ZFS compression using a number of dataset + properties.
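Compression is enabled by setting the compression property on a dataset. As a minimal sketch, assuming the mypool/compressed_dataset dataset shown in the next example, LZ4 compression could be enabled with:

&prompt.root; zfs set compression=lz4 mypool/compressed_dataset

Only data written after the property is changed is compressed; existing blocks remain as they were originally written.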
+ + &prompt.root; zfs get used,compressratio,compression,logicalused mypool/compressed_dataset +NAME PROPERTY VALUE SOURCE +mypool/compressed_dataset used 449G - +mypool/compressed_dataset compressratio 1.11x - +mypool/compressed_dataset compression lz4 local +mypool/compressed_dataset logicalused 496G - + + The dataset is currently using 449 GB of storage + space (the used property). If this dataset was not compressed + it would have taken 496 GB of space (the logicallyused + property). This results in the 1.11:1 compression + ratio. + + Compression can have an unexpected side effect when + combined with + User Quotas. + ZFS user quotas restrict how much space + a user can consume on a dataset, however the measurements are + based on how much data is stored, after compression. So if a + user has a quota of 10 GB, and writes 10 GB of + compressible data, they will still be able to store additional + data. If they later update a file, say a database, with more + or less compressible data, the amount of space available to + them will change. This can result in the odd situation where + a user did not increase the actual amount of data (the + logicalused property), but the change in + compression means they have now reached their quota. + + Compression can have a similar unexpected interaction with + backups. Quotas are often used to limit how much data can be + stored to ensure there is sufficient backup space available. + However since quotas do not consider compression, more data + may be written than will fit in uncompressed backups. + + - ZFS and Jails + <acronym>ZFS</acronym> and Jails zfs jail and the corresponding jailed property are used to delegate a @@ -1843,22 +2009,22 @@ dedup = 1.05, compress = 1.11, copies = - ZFS Advanced Topics + <acronym>ZFS</acronym> Advanced Topics - ZFS Tuning + <acronym>ZFS</acronym> Tuning - Booting Root on ZFS + Booting Root on <acronym>ZFS</acronym> - ZFS Boot Environments + <acronym>ZFS</acronym> Boot Environments @@ -1870,7 +2036,7 @@ dedup = 1.05, compress = 1.11, copies = - ZFS on i386 + <acronym>ZFS</acronym> on i386 Some of the features provided by ZFS are memory intensive, and may require tuning for maximum @@ -1942,38 +2108,46 @@ vfs.zfs.vdev.cache.size="5M" FreeBSD - Wiki - ZFS + Wiki - ZFS FreeBSD - Wiki - ZFS Tuning + Wiki - ZFS Tuning Illumos - Wiki - ZFS + Wiki - ZFS Oracle - Solaris ZFS Administration Guide + Solaris ZFS Administration + Guide ZFS + xlink:href="http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide">ZFS Evil Tuning Guide ZFS + xlink:href="http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide">ZFS Best Practices Guide + + + Calomel + Blog - ZFS Raidz Performance, Capacity + and Integrity + @@ -2083,10 +2257,9 @@ vfs.zfs.vdev.cache.size="5M" regular single disk vdev can be upgraded to - a mirror vdev at any time using the - zpool attach - command. + a mirror vdev at any time using zpool + attach. @@ -2296,14 +2469,16 @@ vfs.zfs.vdev.cache.size="5M"Dataset is the generic term for a ZFS file system, volume, snapshot or clone. Each dataset has a unique name in - the format: poolname/path@snapshot. + the format: + poolname/path@snapshot. The root of the pool is technically a dataset as well. Child datasets are named hierarchically like directories. For example, - mypool/home, the home dataset, is a - child of mypool and inherits - properties from it. This can be expanded further by - creating mypool/home/user. 
This + grandchild dataset will inherit properties from the parent + and grandparent. Properties on a child can be set to + override the defaults inherited from the parents + and grandparents.

@@ -2440,19 +2615,77 @@ vfs.zfs.vdev.cache.size="5M" Compression - Each dataset in ZFS has a - compression property, which defaults to off. This - property can be set to one of a number of compression - algorithms, which will cause all new data that is - written to the dataset to be compressed. In addition to - the reduction in disk usage, this can also increase read - and write throughput, as only the smaller compressed - version of the file needs to be read or written. + Each dataset has a compression property, which + defaults to off. This property can be set to one of a + number of compression algorithms. This will cause all + new data that is written to the dataset to be + compressed. In addition to the reduction in disk usage, + this can also increase read and write throughput, as + only the smaller compressed version of the file needs to + be read or written.

- - LZ4 compression is only - available after &os; 9.2. -

+ + LZ4 - + was added in ZFS pool version + 5000 (feature flags), and is now the recommended + compression algorithm. LZ4 + compresses approximately 50% faster than + LZJB when operating on + compressible data, and is over three times faster + when operating on uncompressible data. + LZ4 also decompresses + approximately 80% faster than + LZJB. On modern CPUs, + LZ4 can often compress at over + 500 MB/s, and decompress at over + 1.5 GB/s (per single CPU core). + + + LZ4 compression is + only available after &os; 9.2. + + +

+ LZJB - + is the default compression algorithm in + ZFS. It was created by Jeff Bonwick, + one of the original creators of + ZFS. LZJB + offers good compression with less + CPU overhead compared to + GZIP. In the future, the + default compression algorithm will likely change + to LZ4.

+ GZIP - + is a popular stream compression algorithm and is + available in ZFS. One of the + main advantages of using GZIP + is its configurable level of compression. When + setting the compress property, + the administrator can choose which level of + compression to use, ranging from + gzip1, the lowest level of + compression, to gzip9, the + highest level of compression. This gives the + administrator control over how much + CPU time to trade for saved + disk space.

+ ZLE - + (zero length encoding) is a special compression + algorithm that only compresses continuous runs of + zeros. This compression algorithm is only useful + when the dataset contains large, continuous runs of + zeros.

@@ -2508,10 +2741,12 @@ vfs.zfs.vdev.cache.size="5M"scrub is run - at least once each quarter. Checksums of each block are - tested as they are read in normal use, but a scrub - operation makes sure even infrequently used blocks are - checked for silent corruption. + at least once every three months. The checksums of each *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***