From: Ben RUBSON
Date: Thu, 14 Jul 2016 17:50:34 +0200
Subject: Re: HAST + ZFS + NFS + CARP
To: freebsd-fs@freebsd.org
Message-Id: <28C56E3E-E72E-4FD0-A6BB-CE3FC4277A10@gmail.com>
In-Reply-To: <3009ac40-5a29-6f05-ced3-326c9a87c9b2@rlwinm.de>

> On 12 Jul 2016, at 15:15, Jan Bramkamp wrote:
>
> On 04/07/16 19:55, Jordan Hubbard wrote:
>>
>>> On Jul 3, 2016, at 11:05 PM, Ben RUBSON wrote:
>>>
>>> Of course Jordan, in this topic, we (well at least me :) make the following assumption:
>>> one
>>> iSCSI target/disk = one real physical disk (a SAS disk, an SSD...), from a server having its own JBOD, no RAID adapter or whatever, just what ZFS likes!
>>
>> I certainly wouldn’t make that assumption. Once you allow iSCSI to be the back-end in any solution, end-users will avail themselves of the flexibility to also export arbitrary or synthetic devices (like zvols / RAID devices) as “disks”. You can’t stop them from doing so, so you might as well incorporate that scenario into your design. Even if you could somehow enforce the 1:1 mapping of LUN to disk, iSCSI itself is still going to impose a serialization / performance / reporting (iSCSI LUNs don’t report SMART status) penalty that removes a lot of the advantages of having direct physical access to the media, so one might also ask what you’re gaining by imposing those restrictions.
>
>
> How about 3-way ZFS mirrors spread over three SAS JBODs with dual-ported expanders connected to two FreeBSD servers with SAS HBAs and a *reliable* arbiter to the disks? This could either be an external locking server, e.g. consul/etcd/zookeeper, and/or SCSI reservations. If more than two head servers are to share the disks, a pair of SAS switches should do the job.

It would be nice if it could work without a third server, so one important / interesting thing to test would be the SCSI reservations: be sure that when the pool is imported on MASTER, SLAVE can't use the disks anymore.
(This is the case with iSCSI: when SLAVE exports its disks through CTL, it can't import them using ZFS, as CTL locks them as soon as it is started.)

> If N-1 disk redundancy is enough, two JBODs and 2-way mirrors would work as well.

Or if we only have 2 JBODs (for whatever reason), we could (should certainly :) use 4-way mirrors so that if one JBOD dies, we're still confident in the pool.
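
To make the redundancy comparison concrete, here is a minimal Python sketch. It is only an illustration: the function names are hypothetical, and it assumes the disks of each mirror are split evenly across the JBODs, which the thread does not spell out. For each layout discussed above it counts how many disks of a mirror remain in the worst case after losing one whole JBOD.

```python
def surviving_copies(mirror_layout, failed_jbod):
    """Count the disks of one mirror vdev that remain once `failed_jbod` is lost.

    `mirror_layout` lists, for each disk in the mirror, the JBOD that holds it.
    """
    return sum(1 for jbod in mirror_layout if jbod != failed_jbod)


def worst_case_copies(mirror_layout):
    """Fewest remaining disks over every possible single-JBOD failure."""
    return min(surviving_copies(mirror_layout, j) for j in set(mirror_layout))


# Assumed layouts: one entry per disk, the value is the JBOD index.
layouts = {
    "3-way mirror over 3 JBODs": [0, 1, 2],
    "2-way mirror over 2 JBODs": [0, 1],
    "4-way mirror over 2 JBODs": [0, 0, 1, 1],
}

for name, layout in layouts.items():
    left = worst_case_copies(layout)
    print(f"{name}: {left} disk(s) left per mirror after losing one JBOD")
```

Under these assumptions the 3-way and 4-way layouts both keep two disks per mirror after a JBOD failure, while the 2-way layout is left with a single disk and no remaining redundancy.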
> While you can't prevent stupid operators from blowing their feet off, it doesn't offer the same "flexibility" as iSCSI, if only because you can't conveniently hook up everything talking Ethernet and offering itself as an iSCSI target. That is until someone implements a SAS target with CTL and a suitable HBA in FreeBSD ;-).

Why would you prefer a SAS target over an iSCSI target? How would it fit?

> This kind of setup should also preserve all assumptions ZFS has regarding disks.

Yep, although AFAIR no one has demonstrated that ZFS suffers from iSCSI :) (devs on #openzfs stated it does not)
Anyway, this is a nice SAS-only setup, which avoids an additional protocol, a very good reason to go with it.
One good reason for iSCSI is that it allows servers to be in different racks (well, there are long SAS cables) / different rooms / buildings.

> I have the required spare hardware to build a two JBOD test setup [1] and could run some tests if anyone is interested in such a setup.
>
>
> [1]: Test setup
>
> +-----------+    +-----------+
> |  MASTER   |    |   SLAVE   |
> |           |    |           |
> | HBA0 HBA1 |    | HBA0 HBA1 |
> +--+----+---+    +--+----+---+
>    ^    ^           ^    ^
>    |    |           |    |
>    |    |           |    +------+
>    |    |           |           |
>    |    |           +----+      |
>    |    |                |      |
>    |    +-----------+    |      |
>    |                |    |      |
>    v                v    v      |
> +--+--------+    +--+----+---+  |
> |  JBOD 0   |    |  JBOD 1   |  |
> +-------+---+    +-----------+  |
>         ^                       |
>         |                       |
>         +-----------------------+