From: Jordan Hubbard
To: Chris Watson
Cc: Rick Macklem, freebsd-fs, Alexander Motin
Subject: Re: pNFS server Plan B
Date: Sat, 18 Jun 2016 18:50:52 -0700

> On Jun 18, 2016, at 6:14 PM, Chris Watson wrote:
>
> Since Jordan brought up clustering, I would be interested to hear Justin Gibbs' thoughts here. I know about a year ago he was asked on an "after hours" video chat hosted by Matt Ahrens about a feature he would really like to see and he mentioned he would really like, in a universe filled with time and money I'm sure, to work on a native clustering solution for FreeBSD. I don't know if he is subscribed to the list, and I'm certainly not throwing him under the bus by bringing his name up, but I know he has at least been thinking about this for some time and probably has some value to add here.

I think we should also be careful to define our terms in such a discussion. Specifically:

1. Are we talking about block-level clustering underneath ZFS (e.g. HAST or ${somethingElse}) or otherwise incorporated into ZFS itself at some low level?
If you Google for “High-availability ZFS” you will encounter things like RSF-1 or the somewhat more mysterious Zetavault (http://www.zeta.systems/zetavault/high-availability/), but it’s not entirely clear how these technologies work; they simply claim to “scale-out ZFS” or “cluster ZFS” (which can be done within ZFS or one level above and still probably pass the Marketing Test for what people are willing to put on a web page).

2. Are we talking about clustering at a slightly higher level, in a filesystem-agnostic fashion which still preserves filesystem semantics?

3. Are we talking about clustering for data objects, in a fashion which does not necessarily provide filesystem semantics (a sharding database which can store arbitrary BLOBs would qualify)?

For all of the above: Are we seeking to be compatible with any other mechanisms, or are we talking about a FreeBSD-only solution?

This is why I brought up glusterfs / ceph / RiakCS in my previous comments - when talking to the $users that Rick wants to involve in the discussion, they rarely come to the table asking for “some or any sort of clustering, don’t care which or how it works” - they ask if I can offer an S3 compatible object store with horizontal scaling, or if they can use NFS in some clustered fashion where there’s a single namespace offering petabytes of storage with configurable redundancy such that no portion of that namespace is ever unavailable.

I’d be interested in what Justin had in mind when he asked Matt about this.
Being able to “attach ZFS pools to one another” in such a fashion that all clients just see One Big Pool and ZFS’s own redundancy / snapshotting characteristics magically apply to the überpool would be Pretty Cool, obviously, and would allow one to do round-robin DNS for NFS such that any node could serve the same contents, but that also sounds pretty ambitious, depending on how it’s implemented.

- Jordan