From owner-freebsd-doc@FreeBSD.ORG Thu Aug 18 23:00:21 2011
Message-Id: <201108182253.p7IMr0us086588@red.freebsd.org>
Date: Thu, 18 Aug 2011 22:53:00 GMT
From: Warren Block
To: freebsd-gnats-submit@FreeBSD.org
Subject: docs/159897: [patch] improve HAST section of Handbook

>Number:         159897
>Category:       docs
>Synopsis:       [patch] improve HAST section of Handbook
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-doc
>State:          open
>Quarter:
>Keywords:
>Date-Required:
>Class:          doc-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Aug 18 23:00:21 UTC 2011
>Closed-Date:
>Last-Modified:
>Originator:     Warren Block
>Release:        8-STABLE
>Organization:
>Environment:
FreeBSD lightning 8.2-STABLE FreeBSD 8.2-STABLE #0: Wed Aug 17 19:31:39 MDT 2011 root@lightning:/usr/obj/usr/src/sys/LIGHTNING i386
>Description:
Edit and polish the HAST section of the Handbook with an eye to conciseness and clarity.
>How-To-Repeat:
>Fix:
Apply patch.

Patch attached with submission follows:

--- en_US.ISO8859-1/books/handbook/disks/chapter.sgml.orig	2011-08-18 15:22:56.000000000 -0600
+++ en_US.ISO8859-1/books/handbook/disks/chapter.sgml	2011-08-18 16:35:46.000000000 -0600
@@ -4038,7 +4038,7 @@
     Synopsis
 
-    High-availability is one of the main requirements in serious
+    High availability is one of the main requirements in serious
     business applications and highly-available storage is a key
     component in such environments.
     Highly Available STorage, or HAST
@@ -4109,7 +4109,7 @@
       drives.
 
-      File system agnostic, thus allowing to use any file
+      File system agnostic, thus allowing use of any file
       system supported by &os;.
@@ -4152,7 +4152,7 @@
     total.
 
-    Since the HAST works in
+    Since HAST works in
     primary-secondary configuration, it allows only one of the
     cluster nodes to be active at any given time.  The
     primary node, also called
@@ -4175,7 +4175,7 @@
     HAST operates synchronously on a block
-    level, which makes it transparent for file systems and
+    level, making it transparent to file systems and
     applications.  HAST provides regular GEOM
     providers in /dev/hast/ directory
     for use by other tools or applications, thus there is
@@ -4252,7 +4252,7 @@
     For stripped-down systems, make sure this module is
     available.  Alternatively, it is possible to build
     GEOM_GATE support into the kernel
-    statically, by adding the following line to the custom kernel
+    statically, by adding this line to the custom kernel
     configuration file:
 
 options GEOM_GATE
@@ -4290,10 +4290,10 @@
     /dev/hast/) will be
     called test.
 
-    The configuration of HAST is being done
+    Configuration of HAST is done
     in the /etc/hast.conf file.  This file
     should be the same on both nodes.  The simplest configuration
-    possible is following:
+    possible is:
 
 resource test {
   on hasta {
@@ -4317,9 +4317,9 @@
     alternatively in the local DNS.
 
-    Now that the configuration exists on both nodes, it is
-    possible to create the HAST pool.  Run the
-    following commands on both nodes to place the initial metadata
+    Now that the configuration exists on both nodes,
+    the HAST pool can be created.  Run these
+    commands on both nodes to place the initial metadata
     onto the local disk, and start the &man.hastd.8; daemon:
 
 &prompt.root; hastctl create test
@@ -4334,51 +4334,51 @@
     available.
 
-    HAST is not responsible for selecting node's role
-    (primary or secondary).
-    Node's role has to be configured by an administrator or other
-    software like Heartbeat using the
+    A HAST node's role (primary or
+    secondary) is selected by an administrator
+    or other
+    software like Heartbeat using the
     &man.hastctl.8; utility.  Move to the primary node
     (hasta) and
-    issue the following command:
+    issue this command:
 
 &prompt.root; hastctl role primary test
 
-    Similarly, run the following command on the secondary node
+    Similarly, run this command on the secondary node
     (hastb):
 
 &prompt.root; hastctl role secondary test
 
-    It may happen that both of the nodes are not able to
-    communicate with each other and both are configured as
-    primary nodes; the consequence of this condition is called
-    split-brain.  In order to troubleshoot
+    When the nodes are unable to
+    communicate with each other, and both are configured as
+    primary nodes, the condition is called
+    split-brain.  To troubleshoot
     this situation, follow the steps described in .
 
-    It is possible to verify the result with the
+    Verify the result with the
     &man.hastctl.8; utility on each node:
 
 &prompt.root; hastctl status test
 
-    The important text is the status line
-    from its output and it should say complete
+    The important text is the status line,
+    which should say complete
     on each of the nodes.  If it says degraded,
     something went wrong.  At this point, the synchronization
     between the nodes has already started.  The synchronization
-    completes when the hastctl status command
+    completes when hastctl status
     reports 0 bytes of dirty extents.
-    The last step is to create a filesystem on the
+    The next step is to create a filesystem on the
     /dev/hast/test
-    GEOM provider and mount it.  This has to be done on the
-    primary node (as the
+    GEOM provider and mount it.  This must be done on the
+    primary node, as
     /dev/hast/test
-    appears only on the primary node), and
-    it can take a few minutes depending on the size of the hard
+    appears only on the primary node.
+    It can take a few minutes depending on the size of the hard
     drive:
 
 &prompt.root; newfs -U /dev/hast/test
@@ -4387,9 +4387,9 @@
     Once the HAST framework is configured
     properly, the final step is to make sure that
-    HAST is started during the system boot time
-    automatically.  The following line should be added to the
-    /etc/rc.conf file:
+    HAST is started automatically during the system
+    boot.  This line is added to
+    /etc/rc.conf:
 
 hastd_enable="YES"
@@ -4397,26 +4397,25 @@
     Failover Configuration
 
     The goal of this example is to build a robust storage
-    system which is resistant from the failures of any given node.
-    The key task here is to remedy a scenario when a
-    primary node of the cluster fails.  Should
-    it happen, the secondary node is there to
+    system which is resistant to failures of any given node.
+    The scenario is that a
+    primary node of the cluster fails.  If
+    this happens, the secondary node is there to
     take over seamlessly, check and mount the file system, and
     continue to work without missing a single bit of data.
 
-    In order to accomplish this task, it will be required to
-    utilize another feature available under &os; which provides
+    To accomplish this task, another &os; feature provides
     for automatic failover on the IP layer —
-    CARP.  CARP stands for
-    Common Address Redundancy Protocol and allows multiple hosts
+    CARP.  CARP (Common Address
+    Redundancy Protocol) allows multiple hosts
     on the same network segment to share an IP address.  Set up
     CARP on both nodes of the cluster according
     to the documentation available in .
-    After completing this task, each node should have its own
+    After setup, each node will have its own
     carp0 interface with a shared IP
     address 172.16.0.254.
-    Obviously, the primary HAST node of the
-    cluster has to be the master CARP
+    The primary HAST node of the
+    cluster must be the master CARP
     node.
 
     The HAST pool created in the previous
@@ -4430,17 +4429,17 @@
     In the event of CARP interfaces
     going up or down, the &os; operating system generates a &man.devd.8;
-    event, which makes it possible to watch for the state changes
+    event, making it possible to watch for the state changes
     on the CARP interfaces.  A state change on
     the CARP interface is an indication that
-    one of the nodes failed or came back online.  In such a case,
-    it is possible to run a particular script which will
+    one of the nodes failed or came back online.  These state change
+    events make it possible to run a script which will
     automatically handle the failover.
-    To be able to catch the state changes on the
-    CARP interfaces, the following
-    configuration has to be added to the
-    /etc/devd.conf file on each node:
+    To be able to catch state changes on the
+    CARP interfaces, add this
+    configuration to
+    /etc/devd.conf on each node:
 
 notify 30 {
   match "system" "IFNET";
@@ -4456,12 +4455,12 @@
   action "/usr/local/sbin/carp-hast-switch slave";
 };
 
-    To put the new configuration into effect, run the
-    following command on both nodes:
+    Restart &man.devd.8; on both nodes to put the new configuration
+    into effect:
 
 &prompt.root; /etc/rc.d/devd restart
 
-    In the event that the carp0
+    When the carp0
     interface goes up or down (i.e. the interface state changes),
     the system generates a notification, allowing the
     &man.devd.8; subsystem to run an arbitrary script, in this case
@@ -4471,7 +4470,7 @@
     &man.devd.8; configuration, please consult the
     &man.devd.conf.5; manual page.
 
-    An example of such a script could be following:
+    An example of such a script could be:
 
 #!/bin/sh
@@ -4557,13 +4556,13 @@
 ;;
 esac
 
-    In a nutshell, the script does the following when a node
+    In a nutshell, the script takes these actions when a node
     becomes master / primary:
 
-      Promotes the HAST pools as
+      Promotes the HAST pools to
       primary on a given node.
@@ -4571,7 +4570,7 @@
       HAST pool.
 
-      Mounts the pools at appropriate place.
+      Mounts the pools at an appropriate place.
@@ -4590,15 +4589,15 @@
     Keep in mind that this is just an example script which
-    should serve as a proof of concept solution.  It does not
+    should serve as a proof of concept.  It does not
     handle all the possible scenarios and can be extended or
     altered in any way, for example it can start/stop required
-    services etc.
+    services, etc.
 
-    For the purpose of this example we used a standard UFS
-    file system.  In order to reduce the time needed for
+    For this example, we used a standard UFS
+    file system.  To reduce the time needed for
     recovery, a journal-enabled UFS or ZFS
     file system can be used.
@@ -4615,41 +4614,40 @@
     General Troubleshooting Tips
 
-    HAST should be generally working
-    without any issues, however as with any other software
+    HAST should generally work
+    without issues.  However, as with any other software
     product, there may be times when it does not work as
     supposed.  The sources of the problems may be different, but
     the rule of thumb is to ensure that the time is synchronized
     between all nodes of the cluster.
 
-    The debugging level of the &man.hastd.8; should be
-    increased when troubleshooting HAST
-    problems.  This can be accomplished by starting the
+    When troubleshooting HAST problems,
+    the debugging level of &man.hastd.8; should be increased
+    by starting the
     &man.hastd.8; daemon with the -d
-    argument.  Note, that this argument may be specified
+    argument.  Note that this argument may be specified
     multiple times to further increase the debugging level.  A
-    lot of useful information may be obtained this way.  It
-    should be also considered to use -F
-    argument, which will start the &man.hastd.8; daemon in
+    lot of useful information may be obtained this way.  Consider
+    also using the -F
+    argument, which starts the &man.hastd.8; daemon in
     the foreground.
 
     Recovering from the Split-brain Condition
 
-    The consequence of a situation when both nodes of the
-    cluster are not able to communicate with each other and both
-    are configured as primary nodes is called
-    split-brain.  This is a dangerous
+    Split-brain is when the nodes of the
+    cluster are unable to communicate with each other, and both
+    are configured as primary.  This is a dangerous
     condition because it allows both nodes to make incompatible
-    changes to the data.  This situation has to be handled by
-    the system administrator manually.
+    changes to the data.  This problem must be corrected
+    manually by the system administrator.
 
-    In order to fix this situation the administrator has to
+    The administrator must
     decide which node has more important changes (or merge them
-    manually) and let the HAST perform
+    manually) and let HAST perform
     the full synchronization of the node which has the broken
-    data.  To do this, issue the following commands on the node
+    data.  To do this, issue these commands on the node
     which needs to be resynchronized:
 
 &prompt.root; hastctl role init <resource>

>Release-Note:
>Audit-Trail:
>Unformatted:
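
For reference, a minimal sketch of the kind of carp-hast-switch
script the devd.conf entries above invoke, assuming the resource
name "test" from the example configuration and a hypothetical mount
point of /hast; the complete script in the Handbook handles more
cases, such as logging and multiple resources:

#!/bin/sh

# Minimal carp-hast-switch sketch.  devd passes "master" or "slave"
# as the first argument when the carp0 interface changes state.

resource="test"     # HAST resource name from /etc/hast.conf
mountpoint="/hast"  # hypothetical mount point for the pool

case "$1" in
master)
    # This node became the CARP master: promote the HAST pool
    # to primary, check the file system, and mount it.
    hastctl role primary "$resource"

    # Guard against the provider not yet existing when mount runs.
    while [ ! -c "/dev/hast/$resource" ]; do
        sleep 1
    done

    fsck -p -y -t ufs "/dev/hast/$resource"
    mount "/dev/hast/$resource" "$mountpoint"
    ;;
slave)
    # This node became a CARP backup: unmount the pool and
    # demote the HAST resource to secondary.
    umount -f "$mountpoint"
    hastctl role secondary "$resource"
    ;;
*)
    echo "usage: $0 master|slave" >&2
    exit 1
    ;;
esac

The wait loop reflects that the /dev/hast/test device node may take
a moment to appear after promotion; mounting too early would fail.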