From: Mehmet Erol Sanliturk
To: Rick Macklem
Cc: Outback Dingo, freebsd-fs@freebsd.org, Rakshith Venkatesh, Jordan Hubbard
Date: Wed, 9 Sep 2015 05:48:55 -0700
Subject: Re: CEPH + FreeBSD

On Wed, Sep 9, 2015 at 5:20 AM, Rick Macklem wrote:

> Outback Dingo wrote:
> > On Wed, Sep 9, 2015 at 11:31 AM, Mark Saad wrote:
> >
> > > All
> > > What about leofs. It's in ports and has an S3 object store and NFS
> > > support out of the box
> > >
> > > http://www.freshports.org/databases/leofs/
> > > http://leo-project.net
> >
> > LeoFS supports NFSv3 and does not have a lock manager....
>
> I doubt lack of a lock manager is an issue for what I want to do, since
> the NFSv4.1 metadata server (just a regular NFSv4.1 server that can give
> out layouts for reading/writing the data directly on the data servers)
> handles the locking. It is actually much easier to keep track of the
> locking in the NFSv4.1 server and not have to worry about locking on the
> underlying cluster FS. All I intend to do with an NFSv3 server on the
> data server(s) is do Read/Write RPCs. Everything else is handled via the
> NFSv4.1 metadata server.
> (The original RFC required use of NFSv4.1 read/write ops on the data
> servers, but a new layout type called flex files supports NFSv3
> Read/Write for the data servers.)
>
> The key issue for me is whether or not it has a VFS interface to a POSIX
> like file system (via FUSE or ???). At a quick glance at the web page, I
> don't see any mention of this?
> Why? Well, simply the fact that I am looking at extending the current
> kernel based NFSv4.1 server to support pNFS.
> Obviously, there are other ways an NFSv4.1/pNFS server can be built
> (userland NFS-Ganesha that is on Linux, for example), but that isn't
> what I'm interested in doing.
>
> Btw, I took a quick look at MooseFS and it does seem to have this and
> could be an alternative to glusterFS. It isn't an object store and only
> appears to have a single metadata server, which might be a limitation
> for the long term?
> It sounds like MooseFS uses a custom protocol for the chunk/data servers
> and I don't feel like trying to define yet another layout type, so I
> think I would need to add a partial NFSv3 server to the chunk/data
> servers.
>
> I will be looking more closely at both glusterFS and MooseFS soon.
>
> If there are yet more of these cluster object stores that you think
> might be worth considering, feel free to mention them. (I thought I had
> looked at most of them, but hadn't noticed MooseFS, so...)
>
> Thanks for all the comments, rick

http://www.xtreemfs.org/

https://github.com/xtreemfs/xtreemfs ( BSD )

Mehmet Erol Sanliturk

> > > ---
> > > Mark Saad | nonesuch@longcount.org
> > >
> > > > On Sep 6, 2015, at 6:18 PM, Rick Macklem wrote:
> > > >
> > > > Jordan Hubbard wrote:
> > > >>
> > > >>> On Sep 3, 2015, at 4:17 AM, Rick Macklem wrote:
> > > >>>
> > > >>> Slightly off topic but, btw, there is a port of GlusterFS and
> > > >>> those folks do seem interested in seeing it brought "up to
> > > >>> speed". I am not sure how mature it is at this point, but it has
> > > >>> been known to build on amd64. (I don't have an amd64 machine, so
> > > >>> I haven't gotten around to building/testing it, but I do plan to
> > > >>> try and use it as a basis for a pNFS server, if I can figure out
> > > >>> how to get the FH info out of it.
> > > >>> I'm working on that;-)
> > > >>
> > > >> There are at least two distributed (multi-node) object stores for
> > > >> FreeBSD that I know of.
> > > >>
> > > >> One is glusterfs, for which I’m not even really clear on the
> > > >> status of the ports. I don’t see any glusterfs port in the master
> > > >> branch of https://github.com/freebsd/freebsd-ports (or
> > > >> https://github.com/freebsd/freebsd-ports/tree/branches/2015Q3 for
> > > >> that matter).
> > > >>
> > > >> Our FreeNAS ports tree (https://github.com/freenas/ports), in
> > > >> which we have a bit more latitude to add and curate our own ports,
> > > >> has both a net/glusterfs and sysutils/glusterfs, from separate
> > > >> sources (looks like we need to clean things up) - net/glusterfs
> > > >> lists craig001@lerwick.hopto.org as the MAINTAINER and is at
> > > >> version 3.6.2. The sysutils/glusterfs port lists bapt@FreeBSD.org
> > > >> as the MAINTAINER and is at version 20140811.
> > > >>
> > > >> I’m not really sure about the provenance since we were simply
> > > >> evaluating glusterfs for a while and may have pulled in interim
> > > >> versions from those sources, but obviously it would be best to
> > > >> have an official maintainer and someone in the FreeBSD project
> > > >> actually curating a glusterfs port so that all users of FreeBSD
> > > >> can use it. It would also be fairly key to your own efforts,
> > > >> assuming you decide to pursue glusterfs as a foundation technology
> > > >> for pNFS.
> > > >>
> > > >> The other object store, which is pretty mature and is currently
> > > >> leading the pack (of two :) ) for inclusion into FreeNAS is RiakCS
> > > >> from Basho. There is a port currently in databases/riak but it’s
> > > >> pretty out of date at version 1.4.12 (the current version is
> > > >> 2.0.1, with 2.0 being a major upgrade of RiakCS).
> > > >>
> > > >> We are very interested in investigating various ways of shimming
> > > >> RiakCS to NFS, using RiakCS as a back-end store. Is that something
> > > >> you’d be amenable to discussing? I’d be happy to send you an amd64
> > > >> architecture machine to develop on. :)
> > > >
> > > > Hmm. From a quick look at their web page (I looked once before as
> > > > well), I don't think RiakCS has what I need to do pNFS in a
> > > > reasonable (for me) amount of effort.
> > > > Two things that glusterFS has that I am hoping to use (and I don't
> > > > think RiakCS has either of these) are:
> > > > - A Fuse file system interface which allows the kernel nfsd to
> > > >   access the store as a file system, so that it can provide the
> > > >   metadata services (NFS without the reads/writes).
> > > > - A userland NFSv3 server in each node which will allow the node
> > > >   to act as a data server.
> > > >
> > > > If I am wrong and RiakCS does support a VFS file system interface
> > > > (via Fuse or ???), then please correct me. With that, it might be
> > > > a reasonable alternative.
> > > > I'll admit I've spent a little time looking at the glusterFS
> > > > sources and haven't yet solved the problem of how to generate the
> > > > file handles I need, but that sounds trivial compared with an
> > > > entire Fuse and/or VFS file system interface, I think?
> > > >
> > > > In general, using a cloud object store to implement a pNFS server
> > > > is a *mis*use of the technology, imho. I think it may be possible
> > > > with glusterFS, since that technology seems to be based on a
> > > > cluster file system, which is what a pNFS server can also use.
> > > >
> > > > I think there would be a lot of work involved in mapping a POSIX
> > > > file system onto the Riak database and then exporting that via
> > > > NFS, etc.
> > > > It might also be more practical to do this via a userland NFS
> > > > service than the kernel based one currently in FreeBSD.
> > > > (glusterFS is starting to use the NFS-ganesha server, but I
> > > > believe it is pretty Linux specific, so I doubt it would be useful
> > > > for Riak running on FreeBSD?)
> > > >
> > > > rick
> > > >
> > > >> - Jordan
> > > > _______________________________________________
> > > > freebsd-fs@freebsd.org mailing list
> > > > https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"