Subject: Re: ZFS High-Availability NAS
To: freebsd-fs@freebsd.org
From: Jan Bramkamp
Date: Mon, 22 Jan 2018 11:24:05 +0100

On 19.01.18 16:19, Julien Cigar wrote:
> Hello,
>
> I'm wondering if someone has already put in production something
> similar to https://github.com/ewwhite/zfs-ha/wiki (with two HBAs and
> two disk shelves in two chains) with ZFS, gmultipath, and NFS on top ?

I tried to build something similar, but ran into problems with gmultipath.
There are situations in which gmultipath retries an I/O request ad infinitum, because it can't tell the difference between temporary errors (e.g. a link failure) and permanent errors (e.g. a head crash). It just sees that the struct bio contains an error, and once it runs out of active links it reactivates the failed links and retries. Other than that it worked ;-).
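
To make the failure mode concrete, here is a toy model of the retry behaviour described above. It is only a sketch and not the actual GEOM code: Path, submit_io, do_io and max_attempts are names invented for the illustration, and the attempt budget exists only so that the demo terminates (the behaviour I saw had no such limit).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Path:
    """Hypothetical stand-in for one link to the disk."""
    name: str
    active: bool = True


def submit_io(paths: List[Path],
              do_io: Callable[[Path, int], bool],
              max_attempts: int) -> Optional[int]:
    """Toy multipath retry loop.

    do_io(path, attempt_number) returns True on success, False on error.
    The loop learns nothing else about the error, so it cannot tell a
    link failure from a dead disk.
    Returns the number of attempts used, or None if the (artificial)
    attempt budget ran out.
    """
    attempts = 0
    while attempts < max_attempts:
        active = [p for p in paths if p.active]
        if not active:
            # Out of active paths: reactivate every failed path and retry.
            for p in paths:
                p.active = True
            continue
        path = active[0]
        attempts += 1
        if do_io(path, attempts):
            return attempts
        # The request failed; all we can do is mark this path as failed.
        path.active = False
    return None


# Temporary error: the links come back after the third attempt.
link_flap = submit_io([Path("a"), Path("b")], lambda p, n: n > 3, 100)

# Permanent error: the disk itself is gone, so no path ever succeeds.
head_crash = submit_io([Path("a"), Path("b")], lambda p, n: False, 100)
```

In the first case the request eventually succeeds once the links recover. In the second case every path fails, all paths get reactivated, and the cycle repeats until the budget is exhausted; without the budget it would never stop.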