From: Rick Macklem
To: FreeBSD Filesystems
Date: Sat, 26 Apr 2014 15:47:31 -0400 (EDT)
Message-ID: <507714298.1684844.1398541651089.JavaMail.root@uoguelph.ca>
Subject: RFC: using ceph as a backend for an NFSv4.1 pNFS server

Hi,

The non-pNFS v4.1 server in the projects area is just about ready for
head, I think. However, without pNFS, NFSv4.1 isn't all that interesting.

The problem is that doing a pNFS server is a non-trivial exercise.
I am now somewhat familiar with pNFS (from doing the client side), but
have no expertise w.r.t. cluster file systems, etc.

For those not familiar with pNFS, the basic idea is that the NFSv4.1
server becomes a metadata server (MDS) and hands out what are called
layouts and devinfo, so that the client can access the data server(s) (DS)
directly to read/write the file. There are RFCs that define both
block/volume layouts (using iSCSI or similar) and object layouts (using
something called OSD2).

Although I suspect there are many ways to do a pNFS server, I think that
building it on top of a cluster file system may be the simplest.

So, this leads me to...
At a glance (just the web pages, I haven't looked at the source), it
appears that ceph might be useful as a backend to a pNFS server. It has
a POSIX interface (that could be used by the metadata server) as well as
both object (not OSD2, I suspect) and block interfaces. The licensing
appears to be LGPL, which isn't ideal, but I'd say better than GPLv3
(which is what Gluster appears to be).

Does anyone have experience using ceph or some other cluster file system,
such that you might have some idea w.r.t. its usefulness for this?

Any other comments w.r.t. this would be appreciated, including generic
stuff like "we couldn't care less about pNFS" or technical details/opinions.

Thanks in advance for any feedback, rick
ps: I'm nowhere near committing to do this at this point and I do realize
    that even completing the ceph port to FreeBSD might be beyond my
    limited resources.
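
For anyone who wants the MDS/DS split in concrete terms, here is a rough toy sketch of the idea. It is not how any real server does it. The class names, the round-robin striping and the in-memory "data servers" are all assumptions made up for illustration; only the operation names LAYOUTGET and GETDEVICEINFO come from NFSv4.1, and the methods below are only loosely modelled on them.

```python
from dataclasses import dataclass, field


@dataclass
class DataServer:
    """Hypothetical DS: stores stripes keyed by (file handle, stripe index)."""
    stripes: dict = field(default_factory=dict)

    def write(self, fh, idx, data):
        self.stripes[(fh, idx)] = data

    def read(self, fh, idx):
        return self.stripes.get((fh, idx), b"")


@dataclass
class Layout:
    """What the toy MDS hands out: a stripe size and an ordered device id list."""
    stripe_size: int
    device_ids: list


class MetadataServer:
    """Hypothetical MDS: owns layouts and device info, but holds no file data."""

    def __init__(self, devices, stripe_size):
        self.devices = devices          # device id -> DataServer (the "devinfo")
        self.stripe_size = stripe_size
        self.layouts = {}

    def layout_get(self, fh):
        # Loosely modelled on the NFSv4.1 LAYOUTGET operation.
        return self.layouts.setdefault(
            fh, Layout(self.stripe_size, sorted(self.devices)))

    def get_device_info(self, dev_id):
        # Loosely modelled on the NFSv4.1 GETDEVICEINFO operation.
        return self.devices[dev_id]


def client_write(mds, fh, data):
    """Ask the MDS for a layout, then write stripes straight to the DSs."""
    layout = mds.layout_get(fh)
    for off in range(0, len(data), layout.stripe_size):
        idx = off // layout.stripe_size
        dev = layout.device_ids[idx % len(layout.device_ids)]  # round-robin (assumed)
        mds.get_device_info(dev).write(fh, idx, data[off:off + layout.stripe_size])
    # Return the number of stripes written.
    return (len(data) + layout.stripe_size - 1) // layout.stripe_size


def client_read(mds, fh, nstripes):
    """Ask the MDS for a layout, then read stripes straight from the DSs."""
    layout = mds.layout_get(fh)
    out = b""
    for idx in range(nstripes):
        dev = layout.device_ids[idx % len(layout.device_ids)]
        out += mds.get_device_info(dev).read(fh, idx)
    return out
```

The point of the sketch is only that the file data never passes through the MDS: the client asks it where the data lives and then talks to the DSs itself. A cluster file system backend would take the place of the in-memory DataServer objects here.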