Date:      Tue, 29 Mar 2011 20:58:25 +0000 (UTC)
From:      Mikolaj Golub <trociny@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject:   svn commit: r220151 - in stable/8/sbin: hastctl hastd
Message-ID:  <201103292058.p2TKwPL6040796@svn.freebsd.org>

Author: trociny
Date: Tue Mar 29 20:58:25 2011
New Revision: 220151
URL: http://svn.freebsd.org/changeset/base/220151

Log:
  MFC r219351, r219354, r219369, r219370, r219371, r219372, r219373,
    r219385, r219482, r219620, r219669, r219721, r219813, r219814,
    r219815, r219816, r219817, r219818, r219821, r219830, r219831,
    r219832, r219833, r219837, r219844, r219864, r219873, r219879,
    r219882, r219884, r219887, r219900:
  
  r219351 (pjd):
  
  Allow on-the-wire data to be checksummed using either CRC32 or SHA256.
  
  r219354 (pjd):
  
  Allow on-the-wire data to be compressed using one of two algorithms:
  - HOLE - it simply turns all-zero blocks into a header of a few bytes;
          it is extremely fast, so it is turned on by default;
          it is mostly intended to speed up initial synchronization
          where we expect many zeros;
  - LZF - very fast algorithm by Marc Alexander Lehmann, which shows
          very decent compression ratio and has BSD license.
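  The check at the heart of HOLE can be sketched as below. This is only an
  illustration of the idea and is not taken from hast_compression.c; the
  helper name block_is_all_zero() is an assumption made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from hast_compression.c): returns true when
 * every byte of the block is zero, in which case a sender could transmit
 * a short header in place of the payload.
 */
static bool
block_is_all_zero(const unsigned char *data, size_t size)
{
	size_t i;

	for (i = 0; i < size; i++) {
		if (data[i] != 0)
			return (false);
	}
	return (true);
}
```

  A sender would call this once per block and fall back to sending the data
  unchanged whenever it returns false, which is why the cost stays low.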
  
  r219369 (pjd):
  
  Provide three states for pjdlog_initialized, so we can also tell that
  this is the first initialization ever.
  
  r219370 (pjd), r219385 (pjd):
  
  - Turn on printf extensions.
  - Load support for %T for printing time.
  - Add support for %N for printing number in human readable form.
  - Add support for %S for printing sockaddr structure (currently only AF_INET
    family is supported, as this is all we need in HAST).
  - Disable gcc compile-time format checking as this will no longer work.
  
  r219371 (pjd):
  
  Use %S to print IP address and port number.
  
  r219372 (pjd):
  
  - Log size of data to synchronize in human readable form (using %N).
  - Log synchronization time (using %T).
  - Log synchronization speed in human readable form (using %N).
  
  r219373 (pjd):
  
  Print some of the numbers in human readable form (using %N).
  
  r219482:
  
  Make workers inherit debug level from the main process.
  
  r219620 (pjd):
  
  In command line options allow size to be specified using k/M/G/T
  suffixes.
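  The diff below switches hastctl to expand_number(3) from libutil for this.
  For readers without libutil at hand, the parsing can be approximated by the
  self-contained sketch below. parse_size() is a hypothetical stand-in, not
  part of hastctl, and it assumes each suffix scales by a power of 1024.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for expand_number(3); not part of hastctl.
 * Accepts a non-negative number optionally followed by one of k/M/G/T
 * and scales it by the matching power of 1024 (an assumption).
 * Returns 0 on success, -1 on error.
 */
static int
parse_size(const char *str, int64_t *nump)
{
	char *end;
	intmax_t num;
	int shift;

	errno = 0;
	num = strtoimax(str, &end, 0);
	if (errno != 0 || end == str || num < 0)
		return (-1);
	switch (*end) {
	case '\0':
		shift = 0;
		break;
	case 'k':
		shift = 10;
		break;
	case 'M':
		shift = 20;
		break;
	case 'G':
		shift = 30;
		break;
	case 'T':
		shift = 40;
		break;
	default:
		return (-1);
	}
	/* Reject trailing characters after the suffix. */
	if (*end != '\0' && end[1] != '\0')
		return (-1);
	/* Reject values that would overflow int64_t after scaling. */
	if (num > (INT64_MAX >> shift))
		return (-1);
	*nump = (int64_t)num * ((int64_t)1 << shift);
	return (0);
}
```

  Under these assumptions an argument such as "2M" yields 2097152, while
  "10x" is rejected.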
  
  r219669 (pjd):
  
  Remove #include needed for debugging.
  
  r219721:
  
  For secondary, set 2 * HAST_KEEPALIVE seconds timeout for incoming
  connection so the worker will exit if it does not receive packets from
  the primary during this interval.
  
  Reported by:    Christian Vogt <Christian.Vogt@haw-hamburg.de>
  Tested by:      Christian Vogt <Christian.Vogt@haw-hamburg.de>
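  
  One way to get such a timeout is a socket receive timeout, sketched below.
  This is an assumption about the mechanism, not a copy of the HAST code;
  set_keepalive_timeout() is a hypothetical helper, and only the value of
  HAST_KEEPALIVE (10) comes from hast.h in this commit.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>

#define	HAST_KEEPALIVE	10	/* Value from hast.h in this commit. */

/*
 * Hypothetical helper: applies a receive timeout of 2 * HAST_KEEPALIVE
 * seconds to a socket, so a blocked read fails once the peer has been
 * silent for that long. Returns 0 on success, -1 with errno set.
 */
static int
set_keepalive_timeout(int fd)
{
	struct timeval tv;

	tv.tv_sec = 2 * HAST_KEEPALIVE;
	tv.tv_usec = 0;
	return (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)));
}
```

  With HAST_KEEPALIVE at 10, the secondary would give up after 20 seconds
  without data, that is, after two missed keepalive intervals.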
  
  r219813 (pjd):
  
  If there is any traffic on one of our descriptors, we were not checking for
  long running hooks. Fix it by not using select(2) timeout to decide if we want
  to check hooks or not.
  
  r219814 (pjd):
  
  When creating connection on behalf of primary worker, set pjdlog prefix
  to resource name and role, so that any logs related to that can be identified
  properly.
  
  r219815 (pjd):
  
  Add snprlcat() and vsnprlcat() - the functions I'm always missing.
  They work as a combination of snprintf(3) and strlcat(3) - the caller
  can append a string built from the given format.
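  
  A minimal sketch of that behaviour follows. The function names come from
  the log above, but the signatures and the guard for an already full
  buffer are assumptions of this example, not a copy of subr.c.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Sketch only: format into the unused tail of str, whose total capacity
 * is size bytes. Returns what vsnprintf(3) returns for the appended
 * part, or -1 if str already fills the buffer (assumed behaviour).
 */
static int
vsnprlcat(char *str, size_t size, const char *fmt, va_list ap)
{
	size_t len;

	len = strlen(str);
	if (len >= size)
		return (-1);
	return (vsnprintf(str + len, size - len, fmt, ap));
}

static int
snprlcat(char *str, size_t size, const char *fmt, ...)
{
	va_list ap;
	int result;

	va_start(ap, fmt);
	result = vsnprlcat(str, size, fmt, ap);
	va_end(ap);
	return (result);
}
```

  With this sketch, a buffer holding "res0" followed by
  snprlcat(buf, sizeof(buf), " (%s)", "primary") ends up as
  "res0 (primary)", in one call instead of a format step plus strlcat(3).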
  
  r219816 (pjd):
  
  Use snprlcat() instead of two strlcat(3)s.
  
  r219817 (pjd):
  
  Log when we start hooks checking and when we execute a hook.
  
  r219818 (pjd), r219821 (pjd):
  
  In hast.conf we define the other node's address in 'remote' variable.
  This way we know how to connect to secondary node when we are primary.
  The same variable is used by the secondary node - it only accepts
  connections from the address stored in 'remote' variable.
  In cluster configurations it is common that each node has its individual
  IP address and there is one additional shared IP address which is assigned
  to primary node. It seems it is possible that if the shared IP address is
  from the same network as the individual IP address it might be chosen by
  the kernel as a source address for connection with the secondary node.
  Such connection will be rejected by secondary, as it doesn't come from
  primary node individual IP.
  
  Add 'source' variable that allows specifying the source IP address we want to
  bind to before connecting to the secondary node.
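  
  The underlying technique is to call bind(2) on the socket before
  connect(2). A minimal IPv4 sketch is given below; bind_source() is a
  hypothetical helper written for this illustration and is not taken from
  proto_tcp4.c.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

/*
 * Hypothetical helper, not from proto_tcp4.c: bind fd to the given IPv4
 * source address with an ephemeral port, so that a later connect(2)
 * uses that address instead of one chosen by the kernel.
 * Returns 0 on success, -1 on failure.
 */
static int
bind_source(int fd, const char *ip)
{
	struct sockaddr_in sin;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(0);	/* Let the kernel pick the port. */
	if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
		return (-1);
	return (bind(fd, (struct sockaddr *)&sin, sizeof(sin)));
}
```

  A primary would call such a helper with its individual address, then
  connect to the address given in 'remote', so the secondary sees the
  connection coming from the address it expects.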
  
  r219821 (pjd):
  
  Forgot to commit this as a part of r219818.
  
  r219830 (pjd):
  
  Detect situation where resource internal identifier differs.
  This means that both nodes have separately managed resources that don't
  have the same data.
  
  r219831 (pjd):
  
  Be pedantic and free nvout before exiting.
  
  r219832 (pjd):
  
  Increase debug level of "Checking hooks." message.
  
  r219833 (pjd):
  
  Remove stale comment. Yes, it is valid to set role back to init.
  
  r219837 (pjd):
  
  Before handling any events on descriptors check signals so we can update
  our info about worker processes if any of them was terminated in the meantime.
  
  This fixes the problem with 'hastctl status' running from a hook called on
  split-brain:
  1. Secondary calls a hook and terminates.
  2. Hook asks for resource status via 'hastctl status'.
  3. The main hastd handles the status request by sending it to the secondary
     worker, which is already dead, but because signals weren't checked yet it
     doesn't know that and we get EPIPE.
  
  r219843 (pjd):
  
  Fix typo.
  
  r219844 (pjd):
  
  Initialize localcnt on first write. This fixes assertion when we create
  resource, set role to primary, do no writes, then set it to secondary
  and accept connection from primary.
  
  r219864 (pjd):
  
  White space cleanups.
  
  r219873 (pjd):
  
  The proto API is a general purpose API, so don't use 'hast' in structures or
  function names. It can now be used outside of HAST.
  
  r219879:
  
  For requests that are sent only to remote component use the
  error from remote.
  
  r219882:
  
  After synchronization is complete we should make primary counters be
  equal to secondary counters:
  
    primary_localcnt = secondary_remotecnt
    primary_remotecnt = secondary_localcnt
  
  Previously it was done wrong and split-brain was observed after
  primary had synchronized up-to-date data from secondary.
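  
  The assignment above can be written out as the sketch below. The struct
  and function names are hypothetical and chosen only for this example;
  they are not the fields used in primary.c.

```c
#include <stdint.h>

/* Hypothetical container for the two counters kept on each node. */
struct sync_counters {
	uint64_t localcnt;
	uint64_t remotecnt;
};

/*
 * After a completed synchronization, mirror the secondary's counters
 * onto the primary: what the secondary calls "remote" is the primary's
 * "local", and the other way around.
 */
static void
sync_primary_counters(struct sync_counters *primary,
    const struct sync_counters *secondary)
{
	primary->localcnt = secondary->remotecnt;
	primary->remotecnt = secondary->localcnt;
}
```

  The crossed assignment is the point: copying localcnt to localcnt would
  leave the two nodes disagreeing, which is the situation that shows up as
  split-brain.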
  
  r219887 (pjd):
  
  Add pjd copyright.
  
  r219900 (pjd):
  
  Don't create socketpair for connection forwarding between parent and secondary.
  Secondary doesn't need to connect anywhere.
  
  Approved by:	pjd (mentor)

Added:
  stable/8/sbin/hastd/crc32.c
     - copied unchanged from r219351, head/sbin/hastd/crc32.c
  stable/8/sbin/hastd/crc32.h
     - copied unchanged from r219351, head/sbin/hastd/crc32.h
  stable/8/sbin/hastd/hast_checksum.c
     - copied unchanged from r219351, head/sbin/hastd/hast_checksum.c
  stable/8/sbin/hastd/hast_checksum.h
     - copied unchanged from r219351, head/sbin/hastd/hast_checksum.h
  stable/8/sbin/hastd/hast_compression.c
     - copied unchanged from r219354, head/sbin/hastd/hast_compression.c
  stable/8/sbin/hastd/hast_compression.h
     - copied unchanged from r219354, head/sbin/hastd/hast_compression.h
  stable/8/sbin/hastd/lzf.c
     - copied unchanged from r219354, head/sbin/hastd/lzf.c
  stable/8/sbin/hastd/lzf.h
     - copied unchanged from r219354, head/sbin/hastd/lzf.h
Modified:
  stable/8/sbin/hastctl/Makefile
  stable/8/sbin/hastctl/hastctl.8
  stable/8/sbin/hastctl/hastctl.c
  stable/8/sbin/hastd/Makefile
  stable/8/sbin/hastd/activemap.c
  stable/8/sbin/hastd/control.c
  stable/8/sbin/hastd/hast.conf.5
  stable/8/sbin/hastd/hast.h
  stable/8/sbin/hastd/hast_proto.c
  stable/8/sbin/hastd/hastd.8
  stable/8/sbin/hastd/hastd.c
  stable/8/sbin/hastd/hooks.c
  stable/8/sbin/hastd/parse.y
  stable/8/sbin/hastd/pjdlog.c
  stable/8/sbin/hastd/primary.c
  stable/8/sbin/hastd/proto.c
  stable/8/sbin/hastd/proto.h
  stable/8/sbin/hastd/proto_common.c
  stable/8/sbin/hastd/proto_impl.h
  stable/8/sbin/hastd/proto_socketpair.c
  stable/8/sbin/hastd/proto_tcp4.c
  stable/8/sbin/hastd/proto_uds.c
  stable/8/sbin/hastd/secondary.c
  stable/8/sbin/hastd/subr.c
  stable/8/sbin/hastd/subr.h
  stable/8/sbin/hastd/token.l
Directory Properties:
  stable/8/sbin/hastctl/   (props changed)
  stable/8/sbin/hastd/   (props changed)

Modified: stable/8/sbin/hastctl/Makefile
==============================================================================
--- stable/8/sbin/hastctl/Makefile	Tue Mar 29 20:53:51 2011	(r220150)
+++ stable/8/sbin/hastctl/Makefile	Tue Mar 29 20:58:25 2011	(r220151)
@@ -6,18 +6,21 @@
 
 PROG=	hastctl
 SRCS=	activemap.c
+SRCS+=	crc32.c
 SRCS+=	ebuf.c
-SRCS+=	hast_proto.c hastctl.c
+SRCS+=	hast_checksum.c hast_compression.c hast_proto.c hastctl.c
+SRCS+=	lzf.c
 SRCS+=	metadata.c
 SRCS+=	nv.c
 SRCS+=	parse.y pjdlog.c
-SRCS+=	proto.c proto_common.c proto_tcp4.c proto_uds.c
+SRCS+=	proto.c proto_common.c proto_uds.c
 SRCS+=	token.l
 SRCS+=	subr.c
 SRCS+=	y.tab.h
 WARNS?=	6
 MAN=	hastctl.8
 
+NO_WFORMAT=
 CFLAGS+=-I${.CURDIR}/../hastd
 CFLAGS+=-DINET
 .if ${MK_INET6_SUPPORT} != "no"
@@ -26,8 +29,8 @@ CFLAGS+=-DINET6
 # This is needed to have WARNS > 1.
 CFLAGS+=-DYY_NO_UNPUT
 
-DPADD=	${LIBL}
-LDADD=	-ll
+DPADD=	${LIBL} ${LIBUTIL}
+LDADD=	-ll -lutil
 .if ${MK_OPENSSL} != "no"
 DPADD+=	${LIBCRYPTO}
 LDADD+=	-lcrypto

Modified: stable/8/sbin/hastctl/hastctl.8
==============================================================================
--- stable/8/sbin/hastctl/hastctl.8	Tue Mar 29 20:53:51 2011	(r220150)
+++ stable/8/sbin/hastctl/hastctl.8	Tue Mar 29 20:58:25 2011	(r220151)
@@ -27,7 +27,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd February 1, 2010
+.Dd March 13, 2011
 .Dt HASTCTL 8
 .Os
 .Sh NAME
@@ -113,6 +113,9 @@ Size of the smaller provider used as bac
 This option can be omitted if node providers have the same size on both
 sides.
 .El
+.Pp
+If size is suffixed with a k, M, G or T, it is taken as a kilobyte,
+megabyte, gigabyte or terabyte measurement respectively.
 .It Cm role
 Change role of the given resource.
 The role can be one of:

Modified: stable/8/sbin/hastctl/hastctl.c
==============================================================================
--- stable/8/sbin/hastctl/hastctl.c	Tue Mar 29 20:53:51 2011	(r220150)
+++ stable/8/sbin/hastctl/hastctl.c	Tue Mar 29 20:58:25 2011	(r220151)
@@ -40,7 +40,7 @@ __FBSDID("$FreeBSD$");
 #include <err.h>
 #include <errno.h>
 #include <fcntl.h>
-#include <inttypes.h>
+#include <libutil.h>
 #include <limits.h>
 #include <signal.h>
 #include <stdio.h>
@@ -213,8 +213,10 @@ dump_one(struct hast_resource *res)
 		return (ret);
 
 	printf("resource: %s\n", res->hr_name);
-	printf("    datasize: %ju\n", (uintmax_t)res->hr_datasize);
-	printf("    extentsize: %d\n", res->hr_extentsize);
+	printf("    datasize: %ju (%NB)\n", (uintmax_t)res->hr_datasize,
+	    (intmax_t)res->hr_datasize);
+	printf("    extentsize: %d (%NB)\n", res->hr_extentsize,
+	    (intmax_t)res->hr_extentsize);
 	printf("    keepdirty: %d\n", res->hr_keepdirty);
 	printf("    localoff: %ju\n", (uintmax_t)res->hr_localoff);
 	printf("    resuid: %ju\n", (uintmax_t)res->hr_resuid);
@@ -321,47 +323,33 @@ control_status(struct nv *nv)
 		    nv_get_string(nv, "provname%u", ii));
 		printf("  localpath: %s\n",
 		    nv_get_string(nv, "localpath%u", ii));
-		printf("  extentsize: %u\n",
-		    (unsigned int)nv_get_uint32(nv, "extentsize%u", ii));
+		printf("  extentsize: %u (%NB)\n",
+		    (unsigned int)nv_get_uint32(nv, "extentsize%u", ii),
+		    (intmax_t)nv_get_uint32(nv, "extentsize%u", ii));
 		printf("  keepdirty: %u\n",
 		    (unsigned int)nv_get_uint32(nv, "keepdirty%u", ii));
 		printf("  remoteaddr: %s\n",
 		    nv_get_string(nv, "remoteaddr%u", ii));
+		str = nv_get_string(nv, "sourceaddr%u", ii);
+		if (str != NULL)
+			printf("  sourceaddr: %s\n", str);
 		printf("  replication: %s\n",
 		    nv_get_string(nv, "replication%u", ii));
 		str = nv_get_string(nv, "status%u", ii);
 		if (str != NULL)
 			printf("  status: %s\n", str);
-		printf("  dirty: %ju bytes\n",
-		    (uintmax_t)nv_get_uint64(nv, "dirty%u", ii));
+		printf("  dirty: %ju (%NB)\n",
+		    (uintmax_t)nv_get_uint64(nv, "dirty%u", ii),
+		    (intmax_t)nv_get_uint64(nv, "dirty%u", ii));
 	}
 	return (ret);
 }
 
-static int
-numfromstr(const char *str, intmax_t *nump)
-{
-	intmax_t num;
-	char *suffix;
-	int rerrno;
-
-	rerrno = errno;
-	errno = 0;
-	num = strtoimax(str, &suffix, 0);
-	if (errno == 0 && *suffix != '\0')
-		errno = EINVAL;
-	if (errno != 0)
-		return (-1);
-	*nump = num;
-	errno = rerrno;
-	return (0);
-}
-
 int
 main(int argc, char *argv[])
 {
 	struct nv *nv;
-	intmax_t mediasize, extentsize, keepdirty;
+	int64_t mediasize, extentsize, keepdirty;
 	int cmd, debug, error, ii;
 	const char *optstr;
 
@@ -403,15 +391,15 @@ main(int argc, char *argv[])
 			debug++;
 			break;
 		case 'e':
-			if (numfromstr(optarg, &extentsize) < 0)
+			if (expand_number(optarg, &extentsize) < 0)
 				err(1, "Invalid extentsize");
 			break;
 		case 'k':
-			if (numfromstr(optarg, &keepdirty) < 0)
+			if (expand_number(optarg, &keepdirty) < 0)
 				err(1, "Invalid keepdirty");
 			break;
 		case 'm':
-			if (numfromstr(optarg, &mediasize) < 0)
+			if (expand_number(optarg, &mediasize) < 0)
 				err(1, "Invalid mediasize");
 			break;
 		case 'h':
@@ -481,7 +469,7 @@ main(int argc, char *argv[])
 	}
 
 	/* Setup control connection... */
-	if (proto_client(cfg->hc_controladdr, &controlconn) < 0) {
+	if (proto_client(NULL, cfg->hc_controladdr, &controlconn) < 0) {
 		pjdlog_exit(EX_OSERR,
 		    "Unable to setup control connection to %s",
 		    cfg->hc_controladdr);

Modified: stable/8/sbin/hastd/Makefile
==============================================================================
--- stable/8/sbin/hastd/Makefile	Tue Mar 29 20:53:51 2011	(r220150)
+++ stable/8/sbin/hastd/Makefile	Tue Mar 29 20:58:25 2011	(r220151)
@@ -4,9 +4,10 @@
 
 PROG=	hastd
 SRCS=	activemap.c
-SRCS+=	control.c
+SRCS+=	control.c crc32.c
 SRCS+=	ebuf.c event.c
-SRCS+=	hast_proto.c hastd.c hooks.c
+SRCS+=	hast_checksum.c hast_compression.c hast_proto.c hastd.c hooks.c
+SRCS+=	lzf.c
 SRCS+=	metadata.c
 SRCS+=	nv.c
 SRCS+=	secondary.c
@@ -19,6 +20,8 @@ SRCS+=	y.tab.h
 WARNS?=	6
 MAN=	hastd.8 hast.conf.5
 
+NO_WFORMAT=
+CFLAGS+=-DPROTO_TCP4_DEFAULT_PORT=8457
 CFLAGS+=-I${.CURDIR}
 CFLAGS+=-DINET
 .if ${MK_INET6_SUPPORT} != "no"

Modified: stable/8/sbin/hastd/activemap.c
==============================================================================
--- stable/8/sbin/hastd/activemap.c	Tue Mar 29 20:53:51 2011	(r220150)
+++ stable/8/sbin/hastd/activemap.c	Tue Mar 29 20:58:25 2011	(r220151)
@@ -46,7 +46,7 @@ __FBSDID("$FreeBSD$");
 #define	ACTIVEMAP_MAGIC	0xac71e4
 struct activemap {
 	int		 am_magic;	/* Magic value. */
-	off_t	 	 am_mediasize;	/* Media size in bytes. */
+	off_t		 am_mediasize;	/* Media size in bytes. */
 	uint32_t	 am_extentsize;	/* Extent size in bytes,
 					   must be power of 2. */
 	uint8_t		 am_extentshift;/* 2 ^ extentbits == extentsize */

Modified: stable/8/sbin/hastd/control.c
==============================================================================
--- stable/8/sbin/hastd/control.c	Tue Mar 29 20:53:51 2011	(r220150)
+++ stable/8/sbin/hastd/control.c	Tue Mar 29 20:58:25 2011	(r220151)
@@ -43,6 +43,8 @@ __FBSDID("$FreeBSD$");
 
 #include "hast.h"
 #include "hastd.h"
+#include "hast_checksum.h"
+#include "hast_compression.h"
 #include "hast_proto.h"
 #include "hooks.h"
 #include "nv.h"
@@ -232,6 +234,8 @@ control_status(struct hastd_config *cfg,
 	nv_add_string(nvout, res->hr_provname, "provname%u", no);
 	nv_add_string(nvout, res->hr_localpath, "localpath%u", no);
 	nv_add_string(nvout, res->hr_remoteaddr, "remoteaddr%u", no);
+	if (res->hr_sourceaddr[0] != '\0')
+		nv_add_string(nvout, res->hr_sourceaddr, "sourceaddr%u", no);
 	switch (res->hr_replication) {
 	case HAST_REPLICATION_FULLSYNC:
 		nv_add_string(nvout, "fullsync", "replication%u", no);
@@ -246,6 +250,10 @@ control_status(struct hastd_config *cfg,
 		nv_add_string(nvout, "unknown", "replication%u", no);
 		break;
 	}
+	nv_add_string(nvout, checksum_name(res->hr_checksum),
+	    "checksum%u", no);
+	nv_add_string(nvout, compression_name(res->hr_compression),
+	    "compression%u", no);
 	nv_add_string(nvout, role2str(res->hr_role), "role%u", no);
 
 	switch (res->hr_role) {
@@ -319,7 +327,7 @@ control_handle(struct hastd_config *cfg)
 	if (cmd == HASTCTL_SET_ROLE) {
 		role = nv_get_uint8(nvin, "role");
 		switch (role) {
-		case HAST_ROLE_INIT:	/* Is that valid to set, hmm? */
+		case HAST_ROLE_INIT:
 		case HAST_ROLE_PRIMARY:
 		case HAST_ROLE_SECONDARY:
 			break;

Copied: stable/8/sbin/hastd/crc32.c (from r219351, head/sbin/hastd/crc32.c)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ stable/8/sbin/hastd/crc32.c	Tue Mar 29 20:58:25 2011	(r220151, copy of r219351, head/sbin/hastd/crc32.c)
@@ -0,0 +1,115 @@
+/*-
+ *  COPYRIGHT (C) 1986 Gary S. Brown.  You may use this program, or
+ *  code or tables extracted from it, as desired without restriction.
+ */
+
+/*
+ *  First, the polynomial itself and its table of feedback terms.  The
+ *  polynomial is
+ *  X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X^1+X^0
+ *
+ *  Note that we take it "backwards" and put the highest-order term in
+ *  the lowest-order bit.  The X^32 term is "implied"; the LSB is the
+ *  X^31 term, etc.  The X^0 term (usually shown as "+1") results in
+ *  the MSB being 1
+ *
+ *  Note that the usual hardware shift register implementation, which
+ *  is what we're using (we're merely optimizing it by doing eight-bit
+ *  chunks at a time) shifts bits into the lowest-order term.  In our
+ *  implementation, that means shifting towards the right.  Why do we
+ *  do it this way?  Because the calculated CRC must be transmitted in
+ *  order from highest-order term to lowest-order term.  UARTs transmit
+ *  characters in order from LSB to MSB.  By storing the CRC this way
+ *  we hand it to the UART in the order low-byte to high-byte; the UART
+ *  sends each low-bit to hight-bit; and the result is transmission bit
+ *  by bit from highest- to lowest-order term without requiring any bit
+ *  shuffling on our part.  Reception works similarly
+ *
+ *  The feedback terms table consists of 256, 32-bit entries.  Notes
+ *
+ *      The table can be generated at runtime if desired; code to do so
+ *      is shown later.  It might not be obvious, but the feedback
+ *      terms simply represent the results of eight shift/xor opera
+ *      tions for all combinations of data and CRC register values
+ *
+ *      The values must be right-shifted by eight bits by the "updcrc
+ *      logic; the shift must be unsigned (bring in zeroes).  On some
+ *      hardware you could probably optimize the shift in assembler by
+ *      using byte-swap instructions
+ *      polynomial $edb88320
+ *
+ *
+ * CRC32 code derived from work by Gary S. Brown.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <stdint.h>
+
+#include <crc32.h>
+
+uint32_t crc32_tab[] = {
+	0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, 0x076dc419, 0x706af48f,
+	0xe963a535, 0x9e6495a3,	0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988,
+	0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, 0x1db71064, 0x6ab020f2,
+	0xf3b97148, 0x84be41de,	0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7,
+	0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec,	0x14015c4f, 0x63066cd9,
+	0xfa0f3d63, 0x8d080df5,	0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172,
+	0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b,	0x35b5a8fa, 0x42b2986c,
+	0xdbbbc9d6, 0xacbcf940,	0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59,
+	0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, 0x21b4f4b5, 0x56b3c423,
+	0xcfba9599, 0xb8bda50f, 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924,
+	0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d,	0x76dc4190, 0x01db7106,
+	0x98d220bc, 0xefd5102a, 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433,
+	0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, 0x7f6a0dbb, 0x086d3d2d,
+	0x91646c97, 0xe6635c01, 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e,
+	0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, 0x65b0d9c6, 0x12b7e950,
+	0x8bbeb8ea, 0xfcb9887c, 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65,
+	0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, 0x4adfa541, 0x3dd895d7,
+	0xa4d1c46d, 0xd3d6f4fb, 0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0,
+	0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, 0x5005713c, 0x270241aa,
+	0xbe0b1010, 0xc90c2086, 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f,
+	0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, 0x59b33d17, 0x2eb40d81,
+	0xb7bd5c3b, 0xc0ba6cad, 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a,
+	0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, 0xe3630b12, 0x94643b84,
+	0x0d6d6a3e, 0x7a6a5aa8, 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1,
+	0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, 0xf762575d, 0x806567cb,
+	0x196c3671, 0x6e6b06e7, 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc,
+	0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, 0xd6d6a3e8, 0xa1d1937e,
+	0x38d8c2c4, 0x4fdff252, 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b,
+	0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, 0xdf60efc3, 0xa867df55,
+	0x316e8eef, 0x4669be79, 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236,
+	0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, 0xc5ba3bbe, 0xb2bd0b28,
+	0x2bb45a92, 0x5cb36a04, 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d,
+	0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, 0x9c0906a9, 0xeb0e363f,
+	0x72076785, 0x05005713, 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38,
+	0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, 0x86d3d2d4, 0xf1d4e242,
+	0x68ddb3f8, 0x1fda836e, 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777,
+	0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, 0x8f659eff, 0xf862ae69,
+	0x616bffd3, 0x166ccf45, 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2,
+	0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, 0xaed16a4a, 0xd9d65adc,
+	0x40df0b66, 0x37d83bf0, 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9,
+	0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, 0xbad03605, 0xcdd70693,
+	0x54de5729, 0x23d967bf, 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94,
+	0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d
+};
+
+/*
+ * A function that calculates the CRC-32 based on the table above is
+ * given below for documentation purposes. An equivalent implementation
+ * of this function that's actually used in the kernel can be found
+ * in sys/libkern.h, where it can be inlined.
+ *
+ *	uint32_t
+ *	crc32(const void *buf, size_t size)
+ *	{
+ *		const uint8_t *p = buf;
+ *		uint32_t crc;
+ *
+ *		crc = ~0U;
+ *		while (size--)
+ *			crc = crc32_tab[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
+ *		return crc ^ ~0U;
+ *	}
+ */

Copied: stable/8/sbin/hastd/crc32.h (from r219351, head/sbin/hastd/crc32.h)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ stable/8/sbin/hastd/crc32.h	Tue Mar 29 20:58:25 2011	(r220151, copy of r219351, head/sbin/hastd/crc32.h)
@@ -0,0 +1,28 @@
+/*-
+ *  COPYRIGHT (C) 1986 Gary S. Brown.  You may use this program, or
+ *  code or tables extracted from it, as desired without restriction.
+ *
+ * $FreeBSD$
+ */
+
+#ifndef _CRC32_H_
+#define	_CRC32_H_
+
+#include <stdint.h>	/* uint32_t */
+#include <stdlib.h>	/* size_t */
+
+extern uint32_t crc32_tab[];
+
+static __inline uint32_t
+crc32(const void *buf, size_t size)
+{
+	const uint8_t *p = buf;
+	uint32_t crc;
+
+	crc = ~0U;
+	while (size--)
+		crc = crc32_tab[(crc ^ *p++) & 0xFF] ^ (crc >> 8);
+	return (crc ^ ~0U);
+}
+
+#endif	/* !_CRC32_H_ */

Modified: stable/8/sbin/hastd/hast.conf.5
==============================================================================
--- stable/8/sbin/hastd/hast.conf.5	Tue Mar 29 20:53:51 2011	(r220150)
+++ stable/8/sbin/hastd/hast.conf.5	Tue Mar 29 20:58:25 2011	(r220151)
@@ -1,5 +1,5 @@
 .\" Copyright (c) 2010 The FreeBSD Foundation
-.\" Copyright (c) 2010 Pawel Jakub Dawidek <pjd@FreeBSD.org>
+.\" Copyright (c) 2010-2011 Pawel Jakub Dawidek <pawel@dawidek.net>
 .\" All rights reserved.
 .\"
 .\" This software was developed by Pawel Jakub Dawidek under sponsorship from
@@ -28,7 +28,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd August 30, 2010
+.Dd March 20, 2011
 .Dt HAST.CONF 5
 .Os
 .Sh NAME
@@ -59,6 +59,8 @@ file is following:
 control <addr>
 listen <addr>
 replication <mode>
+checksum <algorithm>
+compression <algorithm>
 timeout <seconds>
 exec <path>
 
@@ -77,6 +79,8 @@ on <node> {
 resource <name> {
 	# Resource section
 	replication <mode>
+	checksum <algorithm>
+	compression <algorithm>
 	name <name>
 	local <path>
 	timeout <seconds>
@@ -89,6 +93,7 @@ resource <name> {
 		local <path>
 		# Required
 		remote <addr>
+		source <addr>
 	}
 	on <node> {
 		# Resource-node section
@@ -97,6 +102,7 @@ resource <name> {
 		local <path>
 		# Required
 		remote <addr>
+		source <addr>
 	}
 }
 .Ed
@@ -201,6 +207,36 @@ The
 .Ic async
 replication mode is currently not implemented.
 .El
+.It Ic checksum Aq algorithm
+.Pp
+Checksum algorithm should be one of the following:
+.Bl -tag -width ".Ic sha256"
+.It Ic none
+No checksum will be calculated for the data being send over the network.
+This is the default setting.
+.It Ic crc32
+CRC32 checksum will be calculated.
+.It Ic sha256
+SHA256 checksum will be calculated.
+.El
+.It Ic compression Aq algorithm
+.Pp
+Compression algorithm should be one of the following:
+.Bl -tag -width ".Ic none"
+.It Ic none
+Data send over the network will not be compressed.
+.It Ic hole
+Only blocks that contain all zeros will be compressed.
+This is very useful for initial synchronization where potentially many blocks
+are still all zeros.
+There should be no measurable performance overhead when this algorithm is being
+used.
+This is the default setting.
+.It Ic lzf
+The LZF algorithm by Marc Alexander Lehmann will be used to compress the data
+send over the network.
+LZF is very fast, general purpose compression algorithm.
+.El
 .It Ic timeout Aq seconds
 .Pp
 Connection timeout in seconds.
@@ -303,6 +339,14 @@ A special value of
 .Va none
 can be used when the remote address is not yet known (eg. the other node is not
 set up yet).
+.It Ic source Aq addr
+.Pp
+Local address to bind to before connecting to the remote
+.Nm hastd
+daemon.
+Format is the same as for the
+.Ic listen
+statement.
 .El
 .Sh FILES
 .Bl -tag -width ".Pa /var/run/hastctl" -compact
@@ -333,10 +377,12 @@ resource shared {
 resource tank {
 	on hasta {
 		local /dev/mirror/tanka
+		source tcp4://10.0.0.1
 		remote tcp4://10.0.0.2
 	}
 	on hastb {
 		local /dev/mirror/tankb
+		source tcp4://10.0.0.2
 		remote tcp4://10.0.0.1
 	}
 }

Modified: stable/8/sbin/hastd/hast.h
==============================================================================
--- stable/8/sbin/hastd/hast.h	Tue Mar 29 20:53:51 2011	(r220150)
+++ stable/8/sbin/hastd/hast.h	Tue Mar 29 20:58:25 2011	(r220151)
@@ -1,5 +1,6 @@
 /*-
  * Copyright (c) 2009-2010 The FreeBSD Foundation
+ * Copyright (c) 2011 Pawel Jakub Dawidek <pawel@dawidek.net>
  * All rights reserved.
  *
  * This software was developed by Pawel Jakub Dawidek under sponsorship from
@@ -85,7 +86,6 @@
 #define	HAST_TIMEOUT	5
 #define	HAST_CONFIG	"/etc/hast.conf"
 #define	HAST_CONTROL	"/var/run/hastctl"
-#define	HASTD_PORT	8457
 #define	HASTD_LISTEN	"tcp4://0.0.0.0:8457"
 #define	HASTD_PIDFILE	"/var/run/hastd.pid"
 
@@ -97,6 +97,9 @@
 #define	HAST_ADDRSIZE	1024
 #define	HAST_TOKEN_SIZE	16
 
+/* Number of seconds to sleep between reconnect retries or keepalive packets. */
+#define	HAST_KEEPALIVE	10
+
 struct hastd_config {
 	/* Address to communicate with hastctl(8). */
 	char	 hc_controladdr[HAST_ADDRSIZE];
@@ -116,6 +119,14 @@ struct hastd_config {
 #define	HAST_REPLICATION_MEMSYNC	1
 #define	HAST_REPLICATION_ASYNC		2
 
+#define	HAST_COMPRESSION_NONE	0
+#define	HAST_COMPRESSION_HOLE	1
+#define	HAST_COMPRESSION_LZF	2
+
+#define	HAST_CHECKSUM_NONE	0
+#define	HAST_CHECKSUM_CRC32	1
+#define	HAST_CHECKSUM_SHA256	2
+
 /*
  * Structure that describes single resource.
  */
@@ -132,6 +143,10 @@ struct hast_resource {
 	int	hr_keepdirty;
 	/* Path to a program to execute on various events. */
 	char	hr_exec[PATH_MAX];
+	/* Compression algorithm. */
+	int	hr_compression;
+	/* Checksum algorithm. */
+	int	hr_checksum;
 
 	/* Path to local component. */
 	char	hr_localpath[PATH_MAX];
@@ -153,6 +168,8 @@ struct hast_resource {
 
 	/* Address of the remote component. */
 	char	hr_remoteaddr[HAST_ADDRSIZE];
+	/* Local address to bind to for outgoing connections. */
+	char	hr_sourceaddr[HAST_ADDRSIZE];
 	/* Connection for incoming data. */
 	struct proto_conn *hr_remotein;
 	/* Connection for outgoing data. */

Copied: stable/8/sbin/hastd/hast_checksum.c (from r219351, head/sbin/hastd/hast_checksum.c)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ stable/8/sbin/hastd/hast_checksum.c	Tue Mar 29 20:58:25 2011	(r220151, copy of r219351, head/sbin/hastd/hast_checksum.c)
@@ -0,0 +1,169 @@
+/*-
+ * Copyright (c) 2011 Pawel Jakub Dawidek <pawel@dawidek.net>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <errno.h>
+#include <string.h>
+#include <strings.h>
+
+#ifdef HAVE_CRYPTO
+#include <openssl/sha.h>
+#endif
+
+#include <crc32.h>
+#include <hast.h>
+#include <nv.h>
+#include <pjdlog.h>
+
+#include "hast_checksum.h"
+
+#ifdef HAVE_CRYPTO
+#define	MAX_HASH_SIZE	SHA256_DIGEST_LENGTH
+#else
+#define	MAX_HASH_SIZE	4
+#endif
+
+static int
+hast_crc32_checksum(const unsigned char *data, size_t size,
+    unsigned char *hash, size_t *hsizep)
+{
+	uint32_t crc;
+
+	crc = crc32(data, size);
+	/* XXXPJD: Do we have to use htole32() on crc first? */
+	bcopy(&crc, hash, sizeof(crc));
+	*hsizep = sizeof(crc);
+
+	return (0);
+}
+
+#ifdef HAVE_CRYPTO
+static int
+hast_sha256_checksum(const unsigned char *data, size_t size,
+    unsigned char *hash, size_t *hsizep)
+{
+	SHA256_CTX ctx;
+
+	SHA256_Init(&ctx);
+	SHA256_Update(&ctx, data, size);
+	SHA256_Final(hash, &ctx);
+	*hsizep = SHA256_DIGEST_LENGTH;
+
+	return (0);
+}
+#endif	/* HAVE_CRYPTO */
+
+const char *
+checksum_name(int num)
+{
+
+	switch (num) {
+	case HAST_CHECKSUM_NONE:
+		return ("none");
+	case HAST_CHECKSUM_CRC32:
+		return ("crc32");
+	case HAST_CHECKSUM_SHA256:
+		return ("sha256");
+	}
+	return ("unknown");
+}
+
+int
+checksum_send(const struct hast_resource *res, struct nv *nv, void **datap,
+    size_t *sizep, bool *freedatap __unused)
+{
+	unsigned char hash[MAX_HASH_SIZE];
+	size_t hsize;
+	int ret;
+
+	switch (res->hr_checksum) {
+	case HAST_CHECKSUM_NONE:
+		return (0);
+	case HAST_CHECKSUM_CRC32:
+		ret = hast_crc32_checksum(*datap, *sizep, hash, &hsize);
+		break;
+#ifdef HAVE_CRYPTO
+	case HAST_CHECKSUM_SHA256:
+		ret = hast_sha256_checksum(*datap, *sizep, hash, &hsize);
+		break;
+#endif
+	default:
+		PJDLOG_ABORT("Invalid checksum: %d.", res->hr_checksum);
+	}
+
+	if (ret != 0)
+		return (ret);
+	nv_add_string(nv, checksum_name(res->hr_checksum), "checksum");
+	nv_add_uint8_array(nv, hash, hsize, "hash");
+	if (nv_error(nv) != 0) {
+		errno = nv_error(nv);
+		return (-1);
+	}
+	return (0);
+}
+
+int
+checksum_recv(const struct hast_resource *res __unused, struct nv *nv,
+    void **datap, size_t *sizep, bool *freedatap __unused)
+{
+	unsigned char chash[MAX_HASH_SIZE];
+	const unsigned char *rhash;
+	size_t chsize, rhsize;
+	const char *algo;
+	int ret;
+
+	algo = nv_get_string(nv, "checksum");
+	if (algo == NULL)
+		return (0);	/* No checksum. */
+	rhash = nv_get_uint8_array(nv, &rhsize, "hash");
+	if (rhash == NULL) {
+		pjdlog_error("Hash is missing.");
+		return (-1);	/* Hash not found. */
+	}
+	if (strcmp(algo, "crc32") == 0)
+		ret = hast_crc32_checksum(*datap, *sizep, chash, &chsize);
+#ifdef HAVE_CRYPTO
+	else if (strcmp(algo, "sha256") == 0)
+		ret = hast_sha256_checksum(*datap, *sizep, chash, &chsize);
+#endif
+	else {
+		pjdlog_error("Unknown checksum algorithm '%s'.", algo);
+		return (-1);	/* Unknown checksum algorithm. */
+	}
+	if (rhsize != chsize) {
+		pjdlog_error("Invalid hash size (%zu) for %s, should be %zu.",
+		    rhsize, algo, chsize);
+		return (-1);	/* Different hash size. */
+	}
+	if (bcmp(rhash, chash, chsize) != 0) {
+		pjdlog_error("Hash mismatch.");
+		return (-1);	/* Hash mismatch. */
+	}
+
+	return (0);
+}
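
Note on the XXXPJD comment in hast_crc32_checksum() above: the function copies the CRC into the hash buffer in host byte order, so two peers with different endianness would appear to produce different byte sequences for the same data. The following is only a minimal sketch of one way to make the on-the-wire form independent of the host, together with the size-then-content comparison that checksum_recv() performs. The helper names put_le32() and hash_matches() are hypothetical and are not part of this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store a 32-bit value in little-endian byte order,
 * regardless of the byte order of the host.
 */
static void
put_le32(uint32_t v, unsigned char *out)
{

	out[0] = (unsigned char)(v & 0xff);
	out[1] = (unsigned char)((v >> 8) & 0xff);
	out[2] = (unsigned char)((v >> 16) & 0xff);
	out[3] = (unsigned char)((v >> 24) & 0xff);
}

/*
 * Hypothetical helper: compare a received hash with a locally computed
 * one.  The sizes are checked first, then the contents, in the same
 * order as checksum_recv().  Returns 1 on match, 0 otherwise.
 */
static int
hash_matches(const unsigned char *rhash, size_t rhsize,
    const unsigned char *chash, size_t chsize)
{

	assert(rhash != NULL && chash != NULL);
	if (rhsize != chsize)
		return (0);
	return (memcmp(rhash, chash, chsize) == 0);
}
```

With a helper like this, the sender would serialize the CRC with put_le32() instead of copying the raw uint32_t, and the receiver would do the same before comparing. Whether HAST needs this depends on whether mixed-endian peers are meant to be supported, which this commit does not state.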

Copied: stable/8/sbin/hastd/hast_checksum.h (from r219351, head/sbin/hastd/hast_checksum.h)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ stable/8/sbin/hastd/hast_checksum.h	Tue Mar 29 20:58:25 2011	(r220151, copy of r219351, head/sbin/hastd/hast_checksum.h)
@@ -0,0 +1,44 @@
+/*-
+ * Copyright (c) 2011 Pawel Jakub Dawidek <pawel@dawidek.net>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ *
+ * $FreeBSD$
+ */
+
+#ifndef	_HAST_CHECKSUM_H_
+#define	_HAST_CHECKSUM_H_
+
+#include <stdlib.h>	/* size_t */
+
+#include <hast.h>
+#include <nv.h>
+
+const char *checksum_name(int num);
+
+int checksum_send(const struct hast_resource *res, struct nv *nv,
+    void **datap, size_t *sizep, bool *freedatap);
+int checksum_recv(const struct hast_resource *res, struct nv *nv,
+    void **datap, size_t *sizep, bool *freedatap);
+
+#endif	/* !_HAST_CHECKSUM_H_ */

Copied: stable/8/sbin/hastd/hast_compression.c (from r219354, head/sbin/hastd/hast_compression.c)
==============================================================================
--- /dev/null	00:00:00 1970	(empty, because file is newly added)
+++ stable/8/sbin/hastd/hast_compression.c	Tue Mar 29 20:58:25 2011	(r220151, copy of r219354, head/sbin/hastd/hast_compression.c)
@@ -0,0 +1,283 @@
+/*-
+ * Copyright (c) 2011 Pawel Jakub Dawidek <pawel@dawidek.net>
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <sys/cdefs.h>
+__FBSDID("$FreeBSD$");
+
+#include <sys/endian.h>
+
+#include <errno.h>
+#include <string.h>
+#include <strings.h>
+
+#include <hast.h>
+#include <lzf.h>
+#include <nv.h>
+#include <pjdlog.h>
+
+#include "hast_compression.h"
+
+static bool
+allzeros(const void *data, size_t size)
+{
+	const uint64_t *p = data;
+	unsigned int i;
+	uint64_t v;
+
+	PJDLOG_ASSERT((size % sizeof(*p)) == 0);
+
+	/*
+	 * This is the fastest method I found for checking if the given
+	 * buffer contains all zeros.
+	 * Because inside the loop we don't check at every step, we would
+	 * get an answer only after walking through the entire buffer.
+	 * To return early if the buffer doesn't contain all zeros, we probe
+	 * 8 bytes at the beginning, in the middle and at the end of the buffer
+	 * first.
+	 */
+
+	size >>= 3;	/* divide by 8 */
+	if ((p[0] | p[size >> 1] | p[size - 1]) != 0)
+		return (false);
+	v = 0;
+	for (i = 0; i < size; i++)
+		v |= *p++;
+	return (v == 0);
+}
+
+static void *
+hast_hole_compress(const unsigned char *data, size_t *sizep)
+{
+	uint32_t size;
+	void *newbuf;
+
+	if (!allzeros(data, *sizep))
+		return (NULL);
+
+	newbuf = malloc(sizeof(size));
+	if (newbuf == NULL) {
+		pjdlog_warning("Unable to compress (no memory: %zu).",
+		    (size_t)*sizep);
+		return (NULL);
+	}
+	size = htole32((uint32_t)*sizep);
+	bcopy(&size, newbuf, sizeof(size));
+	*sizep = sizeof(size);
+
+	return (newbuf);
+}
+
+static void *
+hast_hole_decompress(const unsigned char *data, size_t *sizep)
+{
+	uint32_t size;
+	void *newbuf;
+
+	if (*sizep != sizeof(size)) {
+		pjdlog_error("Unable to decompress (invalid size: %zu).",
+		    *sizep);
+		return (NULL);
+	}
+
+	bcopy(data, &size, sizeof(size));
+	size = le32toh(size);
+
+	newbuf = malloc(size);
+	if (newbuf == NULL) {
+		pjdlog_error("Unable to decompress (no memory: %zu).",
+		    (size_t)size);
+		return (NULL);
+	}
+	bzero(newbuf, size);
+	*sizep = size;
+
+	return (newbuf);
+}
+
+/* Minimum block size to try to compress. */
+#define	HAST_LZF_COMPRESS_MIN	1024
+
+static void *
+hast_lzf_compress(const unsigned char *data, size_t *sizep)
+{
+	unsigned char *newbuf;
+	uint32_t origsize;
+	size_t newsize;
+
+	origsize = *sizep;
+
+	if (origsize <= HAST_LZF_COMPRESS_MIN)
+		return (NULL);
+
+	newsize = sizeof(origsize) + origsize - HAST_LZF_COMPRESS_MIN;
+	newbuf = malloc(newsize);
+	if (newbuf == NULL) {
+		pjdlog_warning("Unable to compress (no memory: %zu).",
+		    newsize);
+		return (NULL);
+	}
+	newsize = lzf_compress(data, *sizep, newbuf + sizeof(origsize),
+	    newsize - sizeof(origsize));
+	if (newsize == 0) {
+		free(newbuf);
+		return (NULL);
+	}
+	origsize = htole32(origsize);
+	bcopy(&origsize, newbuf, sizeof(origsize));
+
+	*sizep = sizeof(origsize) + newsize;
+	return (newbuf);
+}
+
+static void *
+hast_lzf_decompress(const unsigned char *data, size_t *sizep)
+{
+	unsigned char *newbuf;
+	uint32_t origsize;
+	size_t newsize;
+
+	PJDLOG_ASSERT(*sizep > sizeof(origsize));
+
+	bcopy(data, &origsize, sizeof(origsize));
+	origsize = le32toh(origsize);
+	PJDLOG_ASSERT(origsize > HAST_LZF_COMPRESS_MIN);
+
+	newbuf = malloc(origsize);
+	if (newbuf == NULL) {
+		pjdlog_error("Unable to decompress (no memory: %zu).",
+		    (size_t)origsize);
+		return (NULL);
+	}
+	newsize = lzf_decompress(data + sizeof(origsize),
+	    *sizep - sizeof(origsize), newbuf, origsize);
+	if (newsize == 0) {
+		free(newbuf);
+		pjdlog_error("Unable to decompress.");
+		return (NULL);
+	}
+	PJDLOG_ASSERT(newsize == origsize);
+
+	*sizep = newsize;
+	return (newbuf);
+}
+
+const char *
+compression_name(int num)
+{
+
+	switch (num) {
+	case HAST_COMPRESSION_NONE:
+		return ("none");
+	case HAST_COMPRESSION_HOLE:
+		return ("hole");
+	case HAST_COMPRESSION_LZF:
+		return ("lzf");
+	}
+	return ("unknown");
+}
+
+int
+compression_send(const struct hast_resource *res, struct nv *nv, void **datap,
+    size_t *sizep, bool *freedatap)
+{
+	unsigned char *newbuf;
+	int compression;

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
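
For readers who want to try the HOLE scheme from hast_compression.c outside of hastd, here is a minimal standalone sketch of the same idea: an all-zero block is replaced by a 4-byte little-endian header holding its original length, and decoding allocates a zero-filled block of that length. The names all_zeros(), hole_encode() and hole_decode() are hypothetical. Unlike the committed code, this sketch treats an empty buffer as all zeros, reads words through memcpy() to avoid alignment assumptions, and encodes the length with shifts rather than htole32(), so that it builds without <sys/endian.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Load the idx-th 64-bit word of buf without assuming alignment. */
static uint64_t
load64(const unsigned char *buf, size_t idx)
{
	uint64_t v;

	memcpy(&v, buf + idx * sizeof(v), sizeof(v));
	return (v);
}

/* Return 1 if buf holds only zero bytes; size must be a multiple of 8. */
static int
all_zeros(const void *buf, size_t size)
{
	const unsigned char *p = buf;
	size_t i, n = size / sizeof(uint64_t);
	uint64_t acc = 0;

	assert(size % sizeof(uint64_t) == 0);
	if (n == 0)
		return (1);	/* Assumption: empty counts as all zeros. */
	/* Probe the first, middle and last word to fail fast. */
	if ((load64(p, 0) | load64(p, n / 2) | load64(p, n - 1)) != 0)
		return (0);
	for (i = 0; i < n; i++)
		acc |= load64(p, i);
	return (acc == 0);
}

/* Replace an all-zero block with a 4-byte length header, or return NULL. */
static unsigned char *
hole_encode(const void *data, size_t *sizep)
{
	unsigned char *hdr;
	uint32_t len;

	if (*sizep > UINT32_MAX || !all_zeros(data, *sizep))
		return (NULL);
	if ((hdr = malloc(4)) == NULL)
		return (NULL);
	len = (uint32_t)*sizep;
	hdr[0] = (unsigned char)(len & 0xff);
	hdr[1] = (unsigned char)((len >> 8) & 0xff);
	hdr[2] = (unsigned char)((len >> 16) & 0xff);
	hdr[3] = (unsigned char)((len >> 24) & 0xff);
	*sizep = 4;
	return (hdr);
}

/* Expand a 4-byte header back into a zero-filled block, or return NULL. */
static unsigned char *
hole_decode(const unsigned char *hdr, size_t *sizep)
{
	unsigned char *buf;
	uint32_t len;

	if (*sizep != 4)
		return (NULL);
	len = (uint32_t)hdr[0] | ((uint32_t)hdr[1] << 8) |
	    ((uint32_t)hdr[2] << 16) | ((uint32_t)hdr[3] << 24);
	/* calloc() zero-fills; ask for at least one byte so len == 0 works. */
	if ((buf = calloc(1, len > 0 ? len : 1)) == NULL)
		return (NULL);
	*sizep = len;
	return (buf);
}
```

As in the committed code, a NULL return from the encoder means "send the block unmodified". Note that the decoder trusts the length supplied by the peer; a caller that cannot trust the peer would probably want to bound it before allocating.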
