Date:      Fri, 15 Jul 2016 13:45:57 -0600
From:      John Nielsen <lists@jnielsen.net>
To:        freebsd-fs <freebsd-fs@freebsd.org>
Cc:        Jordan Hubbard <jkh@ixsystems.com>, Willem Jan Withagen <wjw@digiware.nl>
Subject:   Re: pNFS server Plan B
Message-ID:  <E0FA996E-AAE9-421F-8D89-28E67B42B70A@jnielsen.net>
In-Reply-To: <20e89f76-867f-67b7-bb80-17acf8de6ed3@digiware.nl>
References:  <1524639039.147096032.1465856925174.JavaMail.zimbra@uoguelph.ca> <D20C793E-A2FD-49F3-AD88-7C2FED5E7715@ixsystems.com> <16d38847-f515-f532-1300-d2843005999e@digiware.nl> <0CB465F9-B819-4DA7-969C-690A02BEB66E@ixsystems.com> <20e89f76-867f-67b7-bb80-17acf8de6ed3@digiware.nl>

Sorry for the very delayed reply but I've been behind on list emails for
a while. I've enjoyed reading the whole thread and have a few comments
below. I'm interested in a lot of this stuff as both a consumer and an
enterprise sysadmin (but I'm not much of a developer). Hopefully I can
provide some of the perspective Rick was looking for.

> On Jun 24, 2016, at 2:21 AM, Willem Jan Withagen <wjw@digiware.nl> wrote:
>
> On 24-6-2016 09:35, Jordan Hubbard wrote:
>>
>>> On Jun 22, 2016, at 1:56 AM, Willem Jan Withagen <wjw@digiware.nl>
>>> wrote:
>>>
>>> In the spare time I have left, I'm trying to get a lot of small
>>> fixes into the ceph tree to get it actually compiling, testing, and
> running on FreeBSD. But Ceph is a lot of code, and since a lot of
> people are working on it, the volume of code changes is big.
>>
>> Hi Willem,
>>
>> Yes, I read your paper on the porting effort!

Indeed, thank you again. I've been wanting to test your patches but
haven't had time; hopefully that will change soon.

>> I also took a look at porting ceph myself, about 2 years ago, and
>> rapidly concluded that it wasn't a small / trivial effort by any
>> means and would require a strong justification in terms of ceph's
>> feature set over glusterfs / moose / OpenAFS / RiakCS / etc.  Since
>> that time, there's been customer interest but nothing truly "strong"
>> per se.
>
> I've been going at it since last November... and all I got in were about
> three batches of FreeBSD-specific commits. A lot of that has to do with
> release windows and code slush, like we know on FreeBSD. But even then
> reviews tend to be slow and I need to push people to look at them.
> Meanwhile all kinds of things get pulled and inserted into the tree that
> seriously are not FreeBSD-compatible. Sometimes I see them during commit
> and "negotiate" better compatibility with the author. At other times I
> miss the whole thing and need to rebase to get rid of merge conflicts,
> only to find out the hard way that somebody has made the whole peer
> communication async and has thrown kqueue for the BSDs at it. But that
> doesn't work (yet), so to get my other patches in I first need to fix
> this. It takes a lot of time...
>
> That all said, I was in Geneva and a lot of the Ceph people were there,
> including Sage Weil. And I got the feeling they appreciated a larger
> community. I think they see what ZFS has done with OpenZFS and see that
> communities get somewhere.

I think too that you're probably wearing them down. :)

> Now one of the things to do to continue, now that I can sort of compile
> and run the first test set, is to set up my own Jenkins infrastructure,
> so that I can at least test-drive some of the tree automatically and get
> some test coverage of the code on FreeBSD. In my mind (and Sage warned
> me that it will be more or less required) it is the only way to actually
> get a serious foot in the door with the Ceph guys.
>
>> My attraction to ceph remains centered around at least these
>> 4 things:
>>
>> 1. Distributed object store with S3-compatible REST API
>> 2. Interoperates with OpenStack via Swift compatibility
>> 3. Block storage (RADOS) - possibly useful for iSCSI and other
>>    block storage requirements
>> 4. Filesystem interface

I will admit I don't have a lot of experience with other things like
GlusterFS, but for me Ceph is very compelling for similar reasons:

1. Block storage (RADOS Block Device). This is the top of my list since
it makes it easy to run a resilient farm of hypervisors that supports
live migration _without_ NFS, iSCSI or anything else. For small
deployments (like I have at home), you can run Ceph and the hypervisors
on the same hardware and still reboot them one at a time without any
storage interruption or having to stop any VMs (just shuffle them
around). No NAS/SAN required at all. Another similar use case (which
just got easier on Linux at least with the release of rbd-nbd support)
is (Docker) containers with persistent data volumes that aren't tied to
any specific host. I would _love_ to see librbd support in bhyve, but
obviously a working librbd on FreeBSD is a prerequisite for that; see
the sketch after this list for the sort of calls that would involve.

2. Distributed object store with S3 and Swift compatibility. A lot of
different enterprises need this for a lot of different reasons. I know
for a fact that some of the pricey commercial offerings use Ceph under
the covers. For shops where budget is more important than commercial
support this is a great option.

3. Everything else, including but not limited to the native object
store (RADOS), the POSIX filesystem (which, as mentioned, is now
advertised as production-quality, with experimental support for
multiple metadata servers), support for arbitrary topologies, custom
CRUSH maps, erasure coding for efficient replication, ...
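
To make point 1 concrete: below is a rough sketch of what a bhyve block
backend (or any other consumer) would end up calling through the librbd
C API, assuming librbd/librados actually build and link on FreeBSD -
which is exactly what Willem's porting work would enable. The pool name
"rbd", the image name "vm-disk0" and the config path are placeholders I
made up for illustration; this isn't tested code.

/* Minimal librbd consumer sketch. Error handling is abbreviated. */
#include <stdio.h>
#include <sys/types.h>
#include <rados/librados.h>
#include <rbd/librbd.h>

int main(void)
{
    rados_t cluster;
    rados_ioctx_t io;
    rbd_image_t image;
    char buf[4096];

    /* Connect to the cluster using ceph.conf and the default
     * client.admin identity. */
    if (rados_create(&cluster, NULL) < 0 ||
        rados_conf_read_file(cluster, "/etc/ceph/ceph.conf") < 0 ||
        rados_connect(cluster) < 0) {
        fprintf(stderr, "cannot connect to cluster\n");
        return 1;
    }

    /* Open an I/O context on the pool that holds the RBD images
     * ("rbd" is just the conventional default pool name). */
    if (rados_ioctx_create(cluster, "rbd", &io) < 0) {
        rados_shutdown(cluster);
        return 1;
    }

    /* Open the image and read its first 4 KiB - a VM block backend
     * would issue reads/writes like this on behalf of the guest. */
    if (rbd_open(io, "vm-disk0", &image, NULL) == 0) {
        ssize_t n = rbd_read(image, 0, sizeof(buf), buf);
        printf("read %zd bytes from vm-disk0\n", n);
        rbd_close(image);
    }

    rados_ioctx_destroy(io);
    rados_shutdown(cluster);
    return 0;
}

Something like "cc rbd_probe.c -lrbd -lrados" should link it once the
libraries exist on FreeBSD; a real bhyve integration would presumably
use the async variants (rbd_aio_read/rbd_aio_write) rather than
blocking calls.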

I do think Ceph on ZFS would be fantastic (and actually have a Fedora
box with a ZFSonLinux-backed OSD). Not sure if BlueStore will be a good
thing or not (even ignoring the porting hurdles, which are unfortunate).
It would be interesting to compare features and performance of a ZFS
OSD and a ZVOL-backed BlueStore OSD.
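
If anyone wants to try the same thing, a FileStore OSD on a ZFS dataset
needs little more than pointing its data directory at the dataset's
mountpoint; something roughly like the ceph.conf sketch below. The
paths, hostname and OSD id are placeholders, and while I believe
"filestore zfs snap" is the knob that lets FileStore use ZFS snapshots
instead of syncfs, double-check that against the docs for whatever
release you're running.

[osd.0]
    host = osd-host0
    ; data directory sits on a ZFS dataset, e.g. tank/ceph/osd0
    osd data = /var/lib/ceph/osd/ceph-0
    osd journal = /var/lib/ceph/osd/ceph-0/journal
    ; let FileStore take ZFS snapshots instead of calling syncfs
    filestore zfs snap = true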

>> Is there anything we can do to help?
>
> I'll get back on that in a separate Email.

With my $work hat on, I'd be interested in a TrueNAS S3 appliance that
came with support.

Anyway, glad to see that both pNFS and Ceph on FreeBSD are potentially
in the works.

JN



