Date: Tue, 8 Apr 2014 15:18:10 +0000 (UTC)
From: Dru Lavigne <dru@FreeBSD.org>
To: doc-committers@freebsd.org, svn-doc-all@freebsd.org, svn-doc-head@freebsd.org
Subject: svn commit: r44485 - head/en_US.ISO8859-1/books/handbook/disks
Message-ID: <201404081518.s38FIAqx070304@svn.freebsd.org>
Author: dru
Date: Tue Apr  8 15:18:09 2014
New Revision: 44485

URL: http://svnweb.freebsd.org/changeset/doc/44485

Log:
  Editorial review of first 1/2 of HAST chapter.

  Sponsored by: iXsystems

Modified:
  head/en_US.ISO8859-1/books/handbook/disks/chapter.xml

Modified: head/en_US.ISO8859-1/books/handbook/disks/chapter.xml
==============================================================================
--- head/en_US.ISO8859-1/books/handbook/disks/chapter.xml  Tue Apr  8 01:16:41 2014  (r44484)
+++ head/en_US.ISO8859-1/books/handbook/disks/chapter.xml  Tue Apr  8 15:18:09 2014  (r44485)
@@ -3297,7 +3297,7 @@ Device 1K-blocks Used Av
 
   <sect1 xml:id="disks-hast">
     <info>
-      <title>Highly Available Storage (HAST)</title>
+      <title>Highly Available Storage (<acronym>HAST</acronym>)</title>
 
       <authorgroup>
 	<author>
@@ -3348,75 +3348,24 @@ Device 1K-blocks Used Av
     <para>High availability is one of the main requirements in
       serious business applications and highly-available storage is a
-      key component in such environments.  Highly Available STorage,
-      or <acronym>HAST<remark role="acronym">Highly
-	Available STorage</remark></acronym>, was developed by
-      &a.pjd.email; as a framework which allows transparent storage of
+      key component in such environments.  In &os;, the Highly Available STorage
+      (<acronym>HAST</acronym>)
+      framework allows transparent storage of
       the same data across several physically separated machines
-      connected by a TCP/IP network.  <acronym>HAST</acronym> can be
+      connected by a <acronym>TCP/IP</acronym> network.  <acronym>HAST</acronym> can be
       understood as a network-based RAID1 (mirror), and is similar to
-      the DRBD® storage system known from the GNU/&linux;
+      the DRBD® storage system used in the GNU/&linux;
       platform.  In combination with other high-availability features
       of &os; like <acronym>CARP</acronym>, <acronym>HAST</acronym>
       makes it possible to build a highly-available storage cluster
       that is resistant to hardware failures.</para>
 
-    <para>After reading this section, you will know:</para>
-
-    <itemizedlist>
-      <listitem>
-	<para>What <acronym>HAST</acronym> is, how it works and
-	  which features it provides.</para>
-      </listitem>
-
-      <listitem>
-	<para>How to set up and use <acronym>HAST</acronym> on
-	  &os;.</para>
-      </listitem>
-
-      <listitem>
-	<para>How to integrate <acronym>CARP</acronym> and
-	  &man.devd.8; to build a robust storage system.</para>
-      </listitem>
-    </itemizedlist>
-
-    <para>Before reading this section, you should:</para>
-
-    <itemizedlist>
-      <listitem>
-	<para>Understand &unix; and <link
-	    linkend="basics">&os; basics</link>.</para>
-      </listitem>
-
-      <listitem>
-	<para>Know how to <link
-	    linkend="config-tuning">configure</link> network
-	  interfaces and other core &os; subsystems.</para>
-      </listitem>
-
-      <listitem>
-	<para>Have a good understanding of <link
-	    linkend="network-communication">&os;
-	    networking</link>.</para>
-      </listitem>
-    </itemizedlist>
-
-    <para>The <acronym>HAST</acronym> project was sponsored by The
-      &os; Foundation with support from <link
-	xlink:href="http://www.omc.net/">OMCnet Internet Service
-	GmbH</link> and <link
-	xlink:href="http://www.transip.nl/">TransIP
-	BV</link>.</para>
-
-    <sect2>
-      <title>HAST Features</title>
-
-      <para>The main features of the <acronym>HAST</acronym> system
-	are:</para>
+    <para>The following are the main features of
+      <acronym>HAST</acronym>:</para>
 
     <itemizedlist>
       <listitem>
-	<para>Can be used to mask I/O errors on local hard
+	<para>Can be used to mask <acronym>I/O</acronym> errors on local hard
 	  drives.</para>
       </listitem>
@@ -3426,9 +3375,9 @@ Device 1K-blocks Used Av
       </listitem>
 
       <listitem>
-	<para>Efficient and quick resynchronization, synchronizing
-	  only blocks that were modified during the downtime of a
-	  node.</para>
+	<para>Efficient and quick resynchronization as
+	  only the blocks that were modified during the downtime of a
+	  node are synchronized.</para>
       </listitem>
 
       <!--
@@ -3450,64 +3399,94 @@ Device 1K-blocks Used Av
 	  system.</para>
       </listitem>
     </itemizedlist>
-  </sect2>
+
+    <para>After reading this section, you will know:</para>
+
+    <itemizedlist>
+      <listitem>
+	<para>What <acronym>HAST</acronym> is, how it works, and
+	  which features it provides.</para>
+      </listitem>
+
+      <listitem>
+	<para>How to set up and use <acronym>HAST</acronym> on
+	  &os;.</para>
+      </listitem>
+
+      <listitem>
+	<para>How to integrate <acronym>CARP</acronym> and
+	  &man.devd.8; to build a robust storage system.</para>
+      </listitem>
+    </itemizedlist>
+
+    <para>Before reading this section, you should:</para>
+
+    <itemizedlist>
+      <listitem>
+	<para>Understand &unix; and &os; basics (<xref
+	    linkend="basics"/>).</para>
+      </listitem>
+
+      <listitem>
+	<para>Know how to configure network
+	  interfaces and other core &os; subsystems (<xref
+	    linkend="config-tuning"/>).</para>
+      </listitem>
+
+      <listitem>
+	<para>Have a good understanding of &os;
+	  networking (<xref
+	    linkend="network-communication"/>).</para>
+      </listitem>
+    </itemizedlist>
+
+    <para>The <acronym>HAST</acronym> project was sponsored by The
+      &os; Foundation with support from <link
+	xlink:href="http://www.omc.net/">http://www.omc.net/</link> and <link
+	xlink:href="http://www.transip.nl/">http://www.transip.nl/</link>.</para>
 
     <sect2>
       <title>HAST Operation</title>
 
-      <para>As <acronym>HAST</acronym> provides a synchronous
-	block-level replication of any storage media to several
-	machines, it requires at least two physical machines:
-	the <literal>primary</literal>, also known as the
-	<literal>master</literal> node, and the
-	<literal>secondary</literal> or <literal>slave</literal>
+      <para><acronym>HAST</acronym> provides synchronous
+	block-level replication between two
+	physical machines:
+	the <emphasis>primary</emphasis>, also known as the
+	<emphasis>master</emphasis> node, and the
+	<emphasis>secondary</emphasis>, or <emphasis>slave</emphasis>
 	node.  These two machines together are referred to as a
 	cluster.</para>
 
-      <note>
-	<para>HAST is currently limited to two cluster nodes in
-	  total.</para>
-      </note>
-
       <para>Since <acronym>HAST</acronym> works in a
 	primary-secondary configuration, it allows only one of the
 	cluster nodes to be active at any given time.  The
-	<literal>primary</literal> node, also called
-	<literal>active</literal>, is the one which will handle all
-	the I/O requests to <acronym>HAST</acronym>-managed
-	devices.  The <literal>secondary</literal> node is
-	automatically synchronized from the <literal>primary</literal>
+	primary node, also called
+	<emphasis>active</emphasis>, is the one which will handle all
+	the <acronym>I/O</acronym> requests to <acronym>HAST</acronym>-managed
+	devices.  The secondary node is
+	automatically synchronized from the primary
 	node.</para>
 
       <para>The physical components of the <acronym>HAST</acronym>
-	system are:</para>
-
-      <itemizedlist>
-	<listitem>
-	  <para>local disk on primary node, and</para>
-	</listitem>
-
-	<listitem>
-	  <para>disk on remote, secondary node.</para>
-	</listitem>
-      </itemizedlist>
+	system are the local disk on the primary node, and the
+	disk on the remote, secondary node.</para>
 
       <para><acronym>HAST</acronym> operates synchronously on a block
 	level, making it transparent to file systems and
 	applications.  <acronym>HAST</acronym> provides regular GEOM
 	providers in <filename>/dev/hast/</filename> for use by
-	other tools or applications, thus there is no difference
+	other tools or applications.  There is no difference
 	between using <acronym>HAST</acronym>-provided devices and
 	raw disks or partitions.</para>
 
-      <para>Each write, delete, or flush operation is sent to the
-	local disk and to the remote disk over TCP/IP.  Each read
+      <para>Each write, delete, or flush operation is sent to both the
+	local disk and to the remote disk over <acronym>TCP/IP</acronym>.  Each read
 	operation is served from the local disk, unless the local disk
-	is not up-to-date or an I/O error occurs.  In such case, the
+	is not up-to-date or an <acronym>I/O</acronym> error occurs.  In such cases, the
 	read operation is sent to the secondary node.</para>
 
       <para><acronym>HAST</acronym> tries to provide fast failure
-	recovery.  For this reason, it is very important to reduce
+	recovery.  For this reason, it is important to reduce
 	synchronization time after a node's outage.  To provide fast
 	synchronization, <acronym>HAST</acronym> manages an on-disk
 	bitmap of dirty extents and only synchronizes those during a
@@ -3520,29 +3499,29 @@ Device 1K-blocks Used Av
 
       <itemizedlist>
 	<listitem>
-	  <para><emphasis>memsync</emphasis>: report write operation
+	  <para><emphasis>memsync</emphasis>: This mode reports a write operation
 	    as completed when the local write operation is finished
 	    and when the remote node acknowledges data arrival, but
 	    before actually storing the data.  The data on the remote
 	    node will be stored directly after sending the
 	    acknowledgement.  This mode is intended to reduce
-	    latency, but still provides very good
+	    latency, but still provides good
 	    reliability.</para>
 	</listitem>
 
 	<listitem>
-	  <para><emphasis>fullsync</emphasis>: report write
-	    operation as completed when local write completes and
-	    when remote write completes.  This is the safest and the
+	  <para><emphasis>fullsync</emphasis>: This mode reports a write
+	    operation as completed when both the local write and the
+	    remote write complete.  This is the safest and the
 	    slowest replication mode.  This mode is the
 	    default.</para>
 	</listitem>
 
 	<listitem>
-	  <para><emphasis>async</emphasis>: report write operation as
-	    completed when local write completes.  This is the
+	  <para><emphasis>async</emphasis>: This mode reports a write operation as
+	    completed when the local write completes.  This is the
 	    fastest and the most dangerous replication mode.  It
-	    should be used when replicating to a distant node where
+	    should only be used when replicating to a distant node where
 	    latency is too high for other modes.</para>
 	</listitem>
       </itemizedlist>
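[The replication mode is selected per resource in the hast.conf(5) file
introduced below.  The following block is a sketch only, not part of the
commit above, and assumes the replication keyword and memsync value as
documented in hast.conf(5).  Selecting memsync for a resource could look
like:

    resource test {
            replication memsync
            on hasta {
                    local /dev/ad6
                    remote 172.16.0.2
            }
            on hastb {
                    local /dev/ad6
                    remote 172.16.0.1
            }
    }
]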
@@ -3551,65 +3530,64 @@ Device 1K-blocks Used Av
     <sect2>
       <title>HAST Configuration</title>
 
-      <para><acronym>HAST</acronym> requires
-	<literal>GEOM_GATE</literal> support which is not present in
-	the default <literal>GENERIC</literal> kernel.  However, the
-	<varname>geom_gate.ko</varname> loadable module is available
-	in the default &os; installation.  Alternatively, to build
-	<literal>GEOM_GATE</literal> support into the kernel
-	statically, add this line to the custom kernel configuration
-	file:</para>
-
-      <programlisting>options GEOM_GATE</programlisting>
-
       <para>The <acronym>HAST</acronym> framework consists of several
-	parts from the operating system's point of view:</para>
+	components:</para>
 
       <itemizedlist>
 	<listitem>
-	  <para>the &man.hastd.8; daemon responsible for data
-	    synchronization,</para>
+	  <para>The &man.hastd.8; daemon which provides data
+	    synchronization.  When this daemon is started, it will
+	    automatically load <varname>geom_gate.ko</varname>.</para>
 	</listitem>
 
 	<listitem>
-	  <para>the &man.hastctl.8; userland management
-	    utility,</para>
+	  <para>The userland management
+	    utility, &man.hastctl.8;.</para>
 	</listitem>
 
 	<listitem>
-	  <para>and the &man.hast.conf.5; configuration file.</para>
+	  <para>The &man.hast.conf.5; configuration file.  This file
+	    must exist before starting
+	    <application>hastd</application>.</para>
 	</listitem>
       </itemizedlist>
 
+      <para>Users who prefer to statically build
+	<literal>GEOM_GATE</literal> support into the kernel
+	should add this line to the custom kernel configuration
+	file, then rebuild the kernel using the instructions in <xref
+	  linkend="kernelconfig"/>:</para>
+
+      <programlisting>options GEOM_GATE</programlisting>
+
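[As the new text notes, hastd(8) loads geom_gate.ko automatically, so a
custom kernel is optional.  As a sketch, not part of the commit above and
assuming the standard loader.conf module-loading convention, the module
can instead be loaded at every boot by adding this line to
/boot/loader.conf:

    geom_gate_load="YES"
]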
       <para>The following example describes how to configure two nodes
-	in <literal>master</literal>-<literal>slave</literal> /
-	<literal>primary</literal>-<literal>secondary</literal>
+	in master-slave/primary-secondary
 	operation using <acronym>HAST</acronym> to replicate the data
 	between the two.  The nodes will be called
-	<literal><replaceable>hasta</replaceable></literal> with an IP address of
-	<replaceable>172.16.0.1</replaceable> and
-	<literal><replaceable>hastb</replaceable></literal> with an IP of address
-	<replaceable>172.16.0.2</replaceable>.  Both nodes will have a
-	dedicated hard drive <filename>/dev/<replaceable>ad6</replaceable></filename> of the same
+	<literal>hasta</literal>, with an <acronym>IP</acronym> address of
+	<literal>172.16.0.1</literal>, and
+	<literal>hastb</literal>, with an <acronym>IP</acronym> address of
+	<literal>172.16.0.2</literal>.  Both nodes will have a
+	dedicated hard drive <filename>/dev/ad6</filename> of the same
 	size for <acronym>HAST</acronym> operation.  The
-	<acronym>HAST</acronym> pool, sometimes also referred to as a
-	resource or the GEOM provider in
+	<acronym>HAST</acronym> pool, sometimes referred to as a
+	resource or the <acronym>GEOM</acronym> provider in
 	<filename class="directory">/dev/hast/</filename>, will be
 	called
-	<filename><replaceable>test</replaceable></filename>.</para>
+	<literal>test</literal>.</para>
 
       <para>Configuration of <acronym>HAST</acronym> is done using
-	<filename>/etc/hast.conf</filename>.  This file should be the
-	same on both nodes.  The simplest configuration possible
+	<filename>/etc/hast.conf</filename>.  This file should be
+	identical on both nodes.  The simplest configuration
 	is:</para>
 
-      <programlisting>resource test {
-	on hasta {
-		local /dev/ad6
-		remote 172.16.0.2
+      <programlisting>resource <replaceable>test</replaceable> {
+	on <replaceable>hasta</replaceable> {
+		local <replaceable>/dev/ad6</replaceable>
+		remote <replaceable>172.16.0.2</replaceable>
 	}
-	on hastb {
-		local /dev/ad6
-		remote 172.16.0.1
+	on <replaceable>hastb</replaceable> {
+		local <replaceable>/dev/ad6</replaceable>
+		remote <replaceable>172.16.0.1</replaceable>
 	}
 }</programlisting>
@@ -3618,18 +3596,18 @@ Device 1K-blocks Used Av
 
       <tip>
 	<para>It is also possible to use host names in the
-	  <literal>remote</literal> statements.  In such a case, make
-	  sure that these hosts are resolvable and are defined in
+	  <literal>remote</literal> statements if
+	  the hosts are resolvable and defined either in
 	  <filename>/etc/hosts</filename> or in the local
 	  <acronym>DNS</acronym>.</para>
       </tip>
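[A sketch, not part of the commit above, of the /etc/hosts entries that
would make the example host names from this configuration resolvable on
both nodes, using the standard hosts(5) format:

    172.16.0.1      hasta
    172.16.0.2      hastb
]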
-      <para>Now that the configuration exists on both nodes,
+      <para>Once the configuration exists on both nodes,
 	the <acronym>HAST</acronym> pool can be created.  Run these
 	commands on both nodes to place the initial metadata onto the
 	local disk and to start &man.hastd.8;:</para>
 
-      <screen>&prompt.root; <userinput>hastctl create test</userinput>
+      <screen>&prompt.root; <userinput>hastctl create <replaceable>test</replaceable></userinput>
 &prompt.root; <userinput>service hastd onestart</userinput></screen>
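[The onestart argument runs the daemon once even though it is not enabled
in /etc/rc.conf.  As a sketch, not part of the commit above and assuming
the usual rc.conf knob for the hastd rc script, the daemon can be enabled
permanently so that it starts at every boot:

    # echo 'hastd_enable="YES"' >> /etc/rc.conf
    # service hastd start
]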
 
       <note>
@@ -3646,50 +3624,40 @@ Device 1K-blocks Used Av
 	administrator, or software like
 	<application>Heartbeat</application>, using &man.hastctl.8;.
 	On the primary node,
-	<literal><replaceable>hasta</replaceable></literal>, issue
+	<literal>hasta</literal>, issue
 	this command:</para>
 
-      <screen>&prompt.root; <userinput>hastctl role primary test</userinput></screen>
-
-      <para>Similarly, run this command on the secondary node,
-	<literal><replaceable>hastb</replaceable></literal>:</para>
+      <screen>&prompt.root; <userinput>hastctl role primary <replaceable>test</replaceable></userinput></screen>
 
-      <screen>&prompt.root; <userinput>hastctl role secondary test</userinput></screen>
+      <para>Run this command on the secondary node,
+	<literal>hastb</literal>:</para>
 
-      <caution>
-	<para>When the nodes are unable to communicate with each
-	  other, and both are configured as primary nodes, the
-	  condition is called <literal>split-brain</literal>.  To
-	  troubleshoot this situation, follow the steps described in
-	  <xref linkend="disks-hast-sb"/>.</para>
-      </caution>
+      <screen>&prompt.root; <userinput>hastctl role secondary <replaceable>test</replaceable></userinput></screen>
 
-      <para>Verify the result by running &man.hastctl.8; on each
+      <para>Verify the result by running <command>hastctl</command> on each
 	node:</para>
 
-      <screen>&prompt.root; <userinput>hastctl status test</userinput></screen>
+      <screen>&prompt.root; <userinput>hastctl status <replaceable>test</replaceable></userinput></screen>
 
-      <para>The important text is the <literal>status</literal> line,
-	which should say <literal>complete</literal>
-	on each of the nodes.  If it says <literal>degraded</literal>,
-	something went wrong.  At this point, the synchronization
-	between the nodes has already started.  The synchronization
+      <para>Check the <literal>status</literal> line in the output.
+	If it says <literal>degraded</literal>,
+	something is wrong with the configuration file.  It should say <literal>complete</literal>
+	on each node, meaning that the synchronization
+	between the nodes has started.  The synchronization
 	completes when <command>hastctl status</command> reports 0
 	bytes of <literal>dirty</literal> extents.</para>
 
       <para>The next step is to create a file system on the
-	<filename>/dev/hast/<replaceable>test</replaceable></filename>
-	GEOM provider and mount it.  This must be done on the
-	<literal>primary</literal> node, as
-	<filename>/dev/hast/<replaceable>test</replaceable></filename>
-	appears only on the <literal>primary</literal> node.  Creating
+	<acronym>GEOM</acronym> provider and mount it.  This must be done on the
+	<literal>primary</literal> node.  Creating
 	the file system can take a few minutes, depending on the size
-	of the hard drive:</para>
+	of the hard drive.  This example creates a <acronym>UFS</acronym>
+	file system on <filename>/dev/hast/test</filename>:</para>
 
-      <screen>&prompt.root; <userinput>newfs -U /dev/hast/test</userinput>
-&prompt.root; <userinput>mkdir /hast/test</userinput>
-&prompt.root; <userinput>mount /dev/hast/test /hast/test</userinput></screen>
+      <screen>&prompt.root; <userinput>newfs -U /dev/hast/<replaceable>test</replaceable></userinput>
+&prompt.root; <userinput>mkdir /hast/<replaceable>test</replaceable></userinput>
+&prompt.root; <userinput>mount /dev/hast/<replaceable>test</replaceable> <replaceable>/hast/test</replaceable></userinput></screen>
 
       <para>Once the <acronym>HAST</acronym> framework is configured
 	properly, the final step is to make sure that