From owner-svn-doc-projects@FreeBSD.ORG Fri Dec 6 13:13:19 2013
Message-Id: <201312061313.rB6DDJ2t065772@svn.freebsd.org>
From: Benedict Reuschling
Date: Fri, 6 Dec 2013 13:13:19 +0000 (UTC)
To: doc-committers@freebsd.org, svn-doc-projects@freebsd.org
Subject: svn commit: r43281 - projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs

Author: bcr
Date: Fri Dec 6 13:13:19 2013
New Revision: 43281
URL: http://svnweb.freebsd.org/changeset/doc/43281

Log:
  Add a section about ZFS self-healing.  An example shows how a mirrored
  pool is intentionally corrupted (with a big warning sign) and how ZFS
  copes with it.  This is based on an example I did for lecture slides I
  created back in the days when the links to www.sun.com were still in
  the output of zpool status.  This needs to be updated later with the
  links that are displayed now.

Modified:
  projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml

Modified: projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml
==============================================================================
--- projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml   Wed Dec  4 15:03:04 2013   (r43280)
+++ projects/zfsupdate-201307/en_US.ISO8859-1/books/handbook/zfs/chapter.xml   Fri Dec  6 13:13:19 2013   (r43281)
@@ -620,6 +620,217 @@ errors: No known data errors
     restored from backups.

ZFS Self-Healing

ZFS utilizes the checksums stored with each data block to provide a feature called self-healing. This feature automatically repairs data whose checksum does not match the one recorded on another device that is part of the storage pool. Consider, for example, a mirror with two disks where one drive is starting to malfunction and can no longer store the data reliably. This is even worse when the data has not been accessed for a long time, as in long-term archive storage. Traditional file systems need to run algorithms that check and repair the data, like the &man.fsck.8; program. These commands take time, and in severe cases an administrator has to decide manually which repair operation has to be performed.
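Self-healing relies on the per-block checksums being enabled. As a quick way to verify this, the checksum property of a dataset can be queried. The following is only a minimal sketch, assuming a hypothetical pool named tank, and shows the default setting:

&prompt.root; zfs get checksum tank
NAME  PROPERTY  VALUE     SOURCE
tank  checksum  on        default

Any value other than off means that every block is checksummed when it is written and verified when it is read.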
When ZFS detects that a data block is being read whose checksum does not match, it tries to read the data from the mirror disk. If that disk can provide the correct data, it will not only give that data to the application requesting it, but also correct the wrong data on the disk that had the bad checksum. This happens without any interaction by a system administrator during normal pool operation.

The following example demonstrates this self-healing behavior in ZFS. First, a mirrored pool of the two disks /dev/ada0 and /dev/ada1 is created.

&prompt.root; zpool create healer mirror /dev/ada0 /dev/ada1
&prompt.root; zpool status healer
  pool: healer
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        healer      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0

errors: No known data errors
&prompt.root; zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
healer   960M  92.5K   960M     0%  1.00x  ONLINE  -

Now, some important data that should be protected from data errors using the self-healing feature is copied to the pool. A checksum of the pool contents is created so that it can be compared against the data later on.

&prompt.root; cp /some/important/data /healer
&prompt.root; zpool list
NAME     SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
healer   960M  67.7M   892M     7%  1.00x  ONLINE  -
&prompt.root; sha1 /healer > checksum.txt
&prompt.root; cat checksum.txt
SHA1 (/healer) = 2753eff56d77d9a536ece6694bf0a82740344d1f

Next, data corruption is simulated by writing random data to the beginning of one of the disks that make up the mirror. To prevent ZFS from healing the data as soon as it detects the corruption, the pool is exported first and imported again afterwards.

Warning: This is a dangerous operation that can destroy vital data. It is shown here for demonstration purposes only and should not be attempted during normal operation of a ZFS storage pool. Nor should this dd example be run on a disk with a different file system on it. Do not use any disk device names other than the ones that are part of the ZFS pool. Make sure that proper backups of the pool are created before running the command!

&prompt.root; zpool export healer
&prompt.root; dd if=/dev/random of=/dev/ada1 bs=1m count=200
200+0 records in
200+0 records out
209715200 bytes transferred in 62.992162 secs (3329227 bytes/sec)
&prompt.root; zpool import healer

The ZFS pool status shows that one device has experienced an error. It is important to know that applications reading data from the pool did not receive any data with a wrong checksum. ZFS provided the application with the data from the ada0 device, which has the correct checksums. The device with the wrong checksum can be found easily, as the CKSUM column contains a value greater than zero.

&prompt.root; zpool status healer
  pool: healer
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        healer      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     1

errors: No known data errors

ZFS detected the error and took care of it by using the redundancy present in the unaffected ada0 mirror disk. A comparison with the original checksum reveals whether the pool is consistent again.

&prompt.root; sha1 /healer >> checksum.txt
&prompt.root; cat checksum.txt
SHA1 (/healer) = 2753eff56d77d9a536ece6694bf0a82740344d1f
SHA1 (/healer) = 2753eff56d77d9a536ece6694bf0a82740344d1f

The two checksums that were generated before and after the intentional tampering with the pool data still match. This shows how ZFS is capable of detecting and correcting errors automatically when the checksums do not match. Note that this is only possible when there is enough redundancy present in the pool; a pool consisting of a single device has no self-healing capabilities. That is also the reason why checksums are so important in ZFS and should not be disabled for any reason. No &man.fsck.8; or similar file system consistency check program was required to detect and correct this, and the pool was available the whole time. A scrub operation is now required to remove the falsely written data from ada1.

&prompt.root; zpool scrub healer
&prompt.root; zpool status healer
  pool: healer
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
  scan: scrub in progress since Mon Dec 10 12:23:30 2012
        10.4M scanned out of 67.0M at 267K/s, 0h3m to go
        9.63M repaired, 15.56% done
config:

        NAME        STATE     READ WRITE CKSUM
        healer      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0   627  (repairing)

errors: No known data errors

The scrub operation reads the data from ada0 and corrects all data that has a wrong checksum on ada1. This is indicated by the (repairing) output of zpool status. After the operation is complete, the pool status changes to the following:

&prompt.root; zpool status healer
  pool: healer
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
  scan: scrub repaired 66.5M in 0h2m with 0 errors on Mon Dec 10 12:26:25 2012
config:

        NAME        STATE     READ WRITE CKSUM
        healer      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0 2.72K

errors: No known data errors

After the scrub operation has completed and all the data has been synchronized from ada0 to ada1, the error messages can be cleared from the pool status by running zpool clear.

&prompt.root; zpool clear healer
&prompt.root; zpool status healer
  pool: healer
 state: ONLINE
  scan: scrub repaired 66.5M in 0h2m with 0 errors on Mon Dec 10 12:26:25 2012
config:

        NAME        STATE     READ WRITE CKSUM
        healer      ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0    ONLINE       0     0     0
            ada1    ONLINE       0     0     0

errors: No known data errors

The pool is now back to a fully working state and all the errors have been cleared.
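Corruption on blocks that are rarely read is only discovered once those blocks are actually read or scrubbed, so it can make sense to scrub pools on a regular schedule. The following is a minimal sketch, not part of the original example: an /etc/crontab entry (see &man.cron.8;) that scrubs the healer pool once a week. The schedule is only a placeholder and should be adapted to the size and workload of the pool.

# minute  hour  mday  month  wday  who    command
# scrub the pool "healer" every Sunday at 03:00
0         3     *     *      0     root   /sbin/zpool scrub healer

A scrub started this way shows up in zpool status in the same way as the manually started scrub above.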
Growing a Pool