Date:      Sun, 11 Mar 2012 22:31:46 +0200
From:      Mikolaj Golub <trociny@freebsd.org>
To:        Phil Regnauld <regnauld@x0.dk>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: Issue with hast replication
Message-ID:  <861uoyvpzh.fsf@kopusha.home.net>
In-Reply-To: <20120311185457.GB1684@macbook.bluepipe.net> (Phil Regnauld's message of "Sun, 11 Mar 2012 19:54:57 +0100")
References:  <20120311185457.GB1684@macbook.bluepipe.net>

On Sun, 11 Mar 2012 19:54:57 +0100 Phil Regnauld wrote:

 PR> Hi,

 PR> I've got a fairly simple setup: two hosts running 9.0-R (will upgrade to stable
 PR> if told to, but want to check here first), ZFS and HAST. HAST is configured to
 PR> run on top of zvols configured on each host, as illustrated:

 PR>       FS                          FS
 PR>    +------+                    +------+ 
 PR>    | hvol | <---- hastd -----> | hvol | 
 PR>    +------+                    +------+ 
 PR>    | zvol |                    | zvol | 
 PR>    +------+                    +------+ 
 PR>    | zfs  |                    | zfs  | 
 PR>    +------+                    +------+ 
 PR>       h1                          h2

 PR> Connection is gigabit to the same switch. No issues with large TCP
 PR> transfers such as SCP/FTP.

 PR> Config is vanilla:

 PR> # zfs create -V 10G zfs/hvol

 PR> hast.conf:

 PR> resource hvol {
 PR>         on h1 {
 PR>                 local /dev/zvol/zfs/hvol
 PR>                 remote tcp4://192.168.1.100
 PR>         }
 PR>         on h2 {
 PR>                 local /dev/zvol/zfs/hvol
 PR>                 remote tcp4://192.168.1.200
 PR>         }
 PR> }


 PR> h1 is behaving fine as primary, either with h2 turned off or in init -
 PR> but as soon as I set the role to secondary for h2, the receiver
 PR> repeatedly crashes and restarts - see the traces below.

 PR> Primary:

 PR> Mar 11 02:02:30 h1 hastd[2282]: [hvol] (primary) Disconnected from tcp4://192.168.1.200.
 PR> Mar 11 02:02:30 h1 hastd[2282]: [hvol] (primary) Unable to write synchronization data: Cannot allocate memory.
 PR> Mar 11 02:02:41 h1 hastd[2282]: [hvol] (primary) Unable to send request (Cannot allocate memory): WRITE(31642091520, 131072).

31642091520 looks like a rather large offset for a 10 GB volume...
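
A quick way to check that arithmetic is sketched below. The offset and length
come from the WRITE request in the log above; the volume size assumes that
"10G" in the zfs create command means 10 GiB, and the variable names are only
illustrative.

```sh
#!/bin/sh
# Values from the hastd log line: WRITE(31642091520, 131072).
offset=31642091520
length=131072

# Assumption: "zfs create -V 10G" gives a volume of 10 * 2^30 bytes.
volsize=$((10 * 1024 * 1024 * 1024))

# First byte past the end of the requested write.
end=$((offset + length))

echo "volume size: $volsize bytes"
echo "write end:   $end bytes"

if [ "$end" -gt "$volsize" ]; then
    echo "write extends past the end of the volume"
else
    echo "write fits inside the volume"
fi
```

Under that assumption the request starts at roughly 29.5 GiB, well past the
end of a 10 GiB volume.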

Just to be more confident that this is a HAST issue, could you please try the
following experiment?

1) Stop hastd on h2.

2) On h1, run something like the following:

  dd if=/dev/zvol/zfs/hvol bs=131072 | ssh h2 dd bs=131072 of=/dev/zvol/zfs/hvol

(This copies hvol from h1 to h2 without hastd, to see whether it succeeds.)

Note: you will need to recreate the HAST provider on the secondary after this.

-- 
Mikolaj Golub


