Date: Tue, 8 Mar 2016 19:26:57 -0500 (EST)
From: Rick Macklem
To: "Robert N. M. Watson"
Cc: Ken Merry, Robert Watson, Julian Elischer, fs@freebsd.org, scsi@freebsd.org
Message-ID: <2091108840.10124858.1457483217137.JavaMail.zimbra@uoguelph.ca>
Subject: Re: FUSE extended attribute patches available

Robert N. M. Watson wrote:
> Just a quick observation: to avoid application change, you could actually
> leave the 'user.' on the front of the strings? It's not harmful, it just
> doesn't serve the same function. This might keep documentation more in sync,
> etc.
>
Btw, this internet draft was just published. There is work in progress w.r.t.
NFS support for Linux style extended attributes and this draft might have
something to say w.r.t. namespace? (I haven't looked at it.)
draft-ietf-nfsv4-xattrs-02.txt
Available via anonymous ftp at ftp.ietf.org (I find it amusing that the
engineers of the internet still use anonymous ftp;-).

rick

> Sent from my iPhone
>
> > On 7 Mar 2016, at 22:28, Ken Merry wrote:
> >
> >> On Mar 7, 2016, at 2:59 AM, Robert Watson wrote:
> >>
> >> FreeBSD and Linux's extended-attribute models were inherited from IRIX, as
> >> they were introduced to solve the same problems: a place to store metadata
> >> such as ACLs, MAC labels, capability masks, etc.
> >> IRIX had three namespaces: one each for "user", "root", and "secure",
> >> reflecting whether or not they were managed by the file owner (or
> >> permissions), the privileged root user, or part of the TCB protection
> >> mechanism (e.g., for integrity labels).
> >>
> >> These extended attributes should not be confused with the filesystem
> >> feature of the same name in NFSv4, which is sometimes known by the name
> >> "file fork" or "data streams". EAs in IRIX/FreeBSD/Linux/HPFS/etc are
> >> tuple pairs of names and values intended to be written atomically or
> >> updated in place specifically for (shortish) metadata such as ACLs,
> >> rather than being complete separate data spaces for I/O (e.g., that could
> >> be memory mapped).
> >
> > It would be nice to have NFSv4 / Solaris style alternate data streams. ZFS
> > handles them already, but I suppose it would take more work to support
> > them in UFS.
> >
> >> In FreeBSD's design, we incorporated the disjoint namespace model,
> >> providing USER and SYSTEM, the former being managed by the file owner
> >> (and those given suitable permission), and the latter being used for TCB
> >> mechanisms such as the implementations of MAC labels, ACLs, etc.
> >>
> >> In Linux, they adopted a more free-form mechanism based on a single
> >> combined namespace with a prefix -- e.g., user.FOO, and system.BAR. Over
> >> time it looks like that namespace has been expanded in various
> >> filesystem-specific ways. We also have room to expand our namespace, but
> >> from the description below, it's not clear quite what the right mechanism
> >> is.
> >>
> >> One path would be to introduce a new namespace for filesystem-specific
> >> attributes -- e.g., EXTATTR_NAMESPACE_FS?
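
The difference between the two models can be sketched in a few lines of C:
Linux carries the namespace as a prefix inside the attribute name, while
FreeBSD passes it as a separate argument. This is only an illustration, not
code from any patch in this thread. The enum and the function name
ea_split_name() are hypothetical; real FreeBSD code would use the
EXTATTR_NAMESPACE_* constants from <sys/extattr.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for FreeBSD's EXTATTR_NAMESPACE_* constants. */
enum ea_namespace { EA_NS_UNKNOWN = 0, EA_NS_USER, EA_NS_SYSTEM };

/*
 * Split a Linux-style name such as "user.FOO" into a namespace and the
 * bare attribute name. When the prefix is neither "user." nor "system.",
 * return EA_NS_UNKNOWN and leave *bare pointing at the whole string.
 */
static enum ea_namespace
ea_split_name(const char *full, const char **bare)
{
    static const struct {
        const char *prefix;
        enum ea_namespace ns;
    } map[] = {
        { "user.", EA_NS_USER },
        { "system.", EA_NS_SYSTEM },
    };
    size_t i;

    assert(full != NULL && bare != NULL);
    for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
        size_t len = strlen(map[i].prefix);

        if (strncmp(full, map[i].prefix, len) == 0) {
            *bare = full + len;     /* skip past the prefix */
            return (map[i].ns);
        }
    }
    *bare = full;
    return (EA_NS_UNKNOWN);
}
```

A name such as "trusted.x" comes back as EA_NS_UNKNOWN with the name left
untouched, which is exactly the case this thread is trying to find a home for.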
> >>
> >> But I think the key question here is whether the existing namespaces can
> >> provide the semantics you need. If not, then we likely need a new
> >> namespace. But then we get the question as to who controls use of the
> >> namespace. Certainly "the filesystem" is one option, but then you will
> >> get inconsistency in approaches between filesystems and applications --
> >> across various dimensions including protection (who can read/modify
> >> them?), allocation (who decides what names should be used for what?), and
> >> semantics (what applications can use them, and who backs them up?).
> >>
> >> For example: who should be responsible for backing up those attributes?
> >> For 'system' attributes in FreeBSD, it is assumed that backup tools will
> >> be aware of the services layered over the attributes -- e.g., that they
> >> will back up ACLs using the ACL API, rather than backing up the binary
> >> EAs holding the ACLs. For 'user' attributes, it is assumed that backup
> >> tools (e.g., tar) must explicitly preserve them, since they are
> >> user-defined and user-managed. For filesystem-specific attributes, some
> >> other choice will need to be made -- perhaps filesystem-specific backup
> >> tools need to know about them?
> >>
> >> Note that in the Linux EA model, ACLs are actually accessed via the EA
> >> system calls, whereas in FreeBSD, ACLs are a first-class citizen in the
> >> system-call API/ABI, and so user applications don't treat them as EAs. We
> >> made that choice as filesystems may choose themselves not to represent
> >> ACLs as EAs, and they have real semantics visible to the VFS layer. In
> >> Linux, I believe they chose to pass them via EAs to narrow the
> >> system-call interface for filesystem metadata.
> >> Both are legitimate choices, but this could also trigger discussions
> >> about whether new attributes are best accessed via the EA interface, or
> >> new system calls. For filesystem-specific attributes, EAs are likely the
> >> better way to go.
> >
> > It may be that for at least the purposes of FUSE, we can adequately live
> > under the USER namespace. That would allow for arbitrary namespaces that
> > Linux-centric filesystems create without significant churn in FreeBSD to
> > support it.
> >
> > And of course this is only for the front/top end of a FUSE filesystem.
> > What the filesystem actually does with the extended attributes that the
> > user sets on top is another question altogether. In the case of IBM's
> > LTFS, it stores extended attributes (without the "user." prefix) in the
> > LTFS index, which is an XML file that resides on tape. For other
> > filesystems, the answer could also vary significantly. A few that I
> > examined in sysutils/fusefs* used extended attributes on the backend
> > (usually on a backing filesystem) under Linux only, but not on the front
> > (user facing) end.
> >
> > In order to make arbitrary namespaces in FUSE work in FreeBSD under the
> > user namespace, we'll have to do what Rick was talking about and just not
> > include the namespace as a prefix when we get/set attributes. This will
> > allow using any sort of namespace or attribute name that the FUSE
> > filesystem wants to use.
> >
> > The impact of this, from a porting standpoint, is that the FUSE filesystems
> > will have to know that on FreeBSD, they cannot/should not expect to see
> > the "user." namespace prefix, but they might see other namespace prefixes.
> >
> > I took a look at the way LTFS and Gluster work with respect to extended
> > attributes with MacOS, since it seems that is how MacOS works, and it's
> > less obvious to me what is going on with Gluster.
> > They've got this function:
> >
> > #ifdef GF_DARWIN_HOST_OS
> > static int
> > set_xattr_user_namespace_mode (struct posix_private *priv, const char *str)
> > {
> >         if (strcmp (str, "none") == 0)
> >                 priv->xattr_user_namespace = XATTR_NONE;
> >         else if (strcmp (str, "strip") == 0)
> >                 priv->xattr_user_namespace = XATTR_STRIP;
> >         else if (strcmp (str, "append") == 0)
> >                 priv->xattr_user_namespace = XATTR_APPEND;
> >         else if (strcmp (str, "both") == 0)
> >                 priv->xattr_user_namespace = XATTR_BOTH;
> >         else
> >                 return -1;
> >         return 0;
> > }
> > #endif
> >
> > Although it's not clear that they do anything with values other than
> > XATTR_STRIP.
> >
> > With LTFS, since they either assume a "user." prefix on Linux, or no prefix
> > on Windows and MacOS X, it's more straightforward.
> >
> > Ken
> >
> >>
> >> Robert
> >>
> >>> On 7 Mar 2016, at 07:16, Julian Elischer wrote:
> >>>
> >>> On 5/03/2016 7:06 PM, Rick Macklem wrote:
> >>>> Ken Merry wrote:
> >>>>> I have patches for FreeBSD's FUSE filesystem kernel module to support
> >>>>> extended attributes:
> >>> oh showing off your masochistic side eh?
> >>>
> >>>>> https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
> >>> I spent an hour beating my head against fuse yesterday.
> >>> then realised that it's an old version on our product. We really have to
> >>> get off 8.0
> >>> (hopefully a matter of weeks now to a 10.x switch)
> >>> Now all I need is to find a FreeBSD filesystem expert (ZFS/NFS/CIFS/GFS)
> >>> to hire.
> >>>
> >>>
> >>>> The only bit of code I have that might be useful for this patch is:
> >>>>         case FUSE_GETXATTR:
> >>>>         case FUSE_LISTXATTR:
> >>>> !               /*
> >>>> !                * These can have varying response lengths, and 0 length
> >>>> !                * isn't necessarily invalid.
> >>>> !                */
> >>>> !               err = 0;
> >>>> *** I came up with this:
> >>>>                 fgin = (struct fuse_getxattr_in *)
> >>>>                     ((char *)ftick->tk_ms_fiov.base +
> >>>>                     sizeof(struct fuse_in_header));
> >>>>                 if (fgin->size == 0)
> >>>>                         err = (blen == sizeof(struct fuse_getxattr_out)) ? 0 :
> >>>>                             EINVAL;
> >>>>                 else
> >>>>                         err = (blen <= fgin->size) ? 0 : EINVAL;
> >>>>                 break;
> >>>> I think I got the size check right?
> >>>>
> >>>> The big question is...
> >>>> What to do with the NAMESPACE?
> >>>> - My code fails for SYSTEM and does USER without prepending "user.".
> >>>>   (That seemed to be what rwatson@ felt was reasonable. I thought our
> >>>>   discussion was on a mailing list, but I can't find it.)
> >>>>   I've cc'd him. Maybe he can comment again.
> >>> Is there a standard for extended attributes I should know about?
> >>> It seems to me that it's a bit like the wild west.
> >>> Extended attributes seem to be "every OS for himself".
> >>>
> >>>>
> >>>> - If you stick with prepending "user." or "system." there needs to be
> >>>>   some way to bypass this so that attributes that don't start in "user."
> >>>>   or "system." can be accessed. I've seen "trusted." and "glusterfs."
> >>>>   on GlusterFS.
> >>>>   --> Maybe a new namespace called something like "nil" that just bypasses
> >>>>       any USER or SYSTEM checks?
> >>>>
> >>>> rick
> >>>>
> >>>>> The patch implements the get/set/delete/list extended attribute
> >>>>> methods. The listing code also converts extended attribute lists from
> >>>>> the Linux/FUSE format to the FreeBSD format.
> >>>>> For example:
> >>>>>
> >>>>> # touch foo
> >>>>> # ls -la foo
> >>>>> -rwxrwxrwx 1 root wheel 0 Feb 29 21:40 foo
> >>>>> # lsextattr user foo
> >>>>> foo
> >>>>> # setextattr user testattr1 "12345678" foo
> >>>>> # lsextattr user foo
> >>>>> foo testattr1
> >>>>> # getextattr user testattr1 foo
> >>>>> foo 12345678
> >>>>> # setextattr user testattr2 "87654321" foo
> >>>>> # lsextattr user foo
> >>>>> foo testattr2 testattr1
> >>>>> # rmextattr user testattr1 foo
> >>>>> # lsextattr user foo
> >>>>> foo testattr2
> >>>>> # getextattr user testattr1 foo
> >>>>> getextattr: foo: failed: Attribute not found
> >>>>> # getextattr user testattr2 foo
> >>>>> foo 87654321
> >>>>>
> >>>>>
> >>>>> Just to be clear on what this does, it only provides extended attribute
> >>>>> support to FreeBSD applications if the underlying FUSE filesystem
> >>>>> implements FUSE extended attribute support. Many FUSE filesystems don't
> >>>>> support the extended attribute VFS operations.
> >>>>>
> >>>>> I have tested this out on IBM's LTFS implementation, but I have not yet
> >>>>> found another FUSE filesystem that supports extended attributes. If
> >>>>> anyone knows of one, please let me know so I can try it out. (I looked
> >>>>> through a number of the filesystems in sysutils/fusefs* in the ports
> >>>>> tree.)
> >>>>>
> >>>>> Any feedback is welcome. I'm planning to check this into FreeBSD/head
> >>>>> in the next week or so.
> >>>>>
> >>>>> Obviously, I've also ported IBM's LTFS implementation to FreeBSD. It
> >>>>> works in the standard FUSE mode, and you can also link it into an
> >>>>> application as a library if you don't want to incur the overhead of
> >>>>> running through FUSE. I haven't gotten around to packaging it up to go
> >>>>> out for testing / review.
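
The list conversion mentioned above (Linux/FUSE format to FreeBSD format)
could look roughly like the sketch below. Linux's listxattr(2) returns
NUL-terminated names back to back, while FreeBSD's extattr_list_file(2)
returns each name preceded by a single length byte and without a terminator.
This is not the code from the patch. The function name
ea_list_linux_to_bsd() is hypothetical, and keeping only names with a given
prefix (and dropping that prefix) is an assumption about what such a
conversion might do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a Linux/FUSE-style list (NUL-terminated names back to back) into
 * a FreeBSD-style list (one length byte, then the name, no terminator).
 * Only names starting with `prefix` are kept, and the prefix is dropped.
 * Returns the number of bytes written to `out`, or -1 if `out` is too
 * small, a name is unterminated, or a name does not fit in one length byte.
 */
static long
ea_list_linux_to_bsd(const char *in, size_t in_len, const char *prefix,
    unsigned char *out, size_t out_len)
{
    size_t plen, pos = 0, used = 0;

    assert(in != NULL && prefix != NULL && out != NULL);
    plen = strlen(prefix);
    while (pos < in_len) {
        const char *name = in + pos;
        const char *end = memchr(name, '\0', in_len - pos);
        size_t nlen;

        if (end == NULL)
            return (-1);            /* last name is not terminated */
        nlen = (size_t)(end - name);
        pos += nlen + 1;            /* step over the name and its NUL */

        if (nlen <= plen || strncmp(name, prefix, plen) != 0)
            continue;               /* other namespace, or empty bare name */
        name += plen;
        nlen -= plen;
        if (nlen > 255 || out_len - used < nlen + 1)
            return (-1);
        out[used++] = (unsigned char)nlen;
        memcpy(out + used, name, nlen);
        used += nlen;
    }
    return ((long)used);
}
```

Given the two attributes from the session above in Linux form
("user.testattr1" and "user.testattr2"), the output would hold two entries,
each a length byte of 9 followed by the bare name.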
> >>>>>
> >>>>> If anyone has IBM LTO-5 or newer tape drives, or IBM TS1140 or newer
> >>>>> tape drives, and wants to try it out, let me know. I'll send you the
> >>>>> code when I've got it at least somewhat ready. This is IBM-specific,
> >>>>> and won't work on HP tape drives.
> >>>>>
> >>>>> Ken
> >>>>> --
> >>>>> Ken Merry
> >>>>> ken@FreeBSD.ORG
> >>>>>
> >>>>> _______________________________________________
> >>>>> freebsd-fs@freebsd.org mailing list
> >>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> >>>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> >
> > --
> > Ken Merry
> > ken@FreeBSD.ORG
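
The reply-length check quoted earlier in the thread for FUSE_GETXATTR and
FUSE_LISTXATTR can be restated as a small standalone function, which makes
the two cases easier to test in isolation. This is a userland sketch, not
kernel code. struct xattr_size_reply is a local stand-in for
struct fuse_getxattr_out, and the name xattr_reply_len_ok() is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the FUSE reply that carries only a size. */
struct xattr_size_reply {
    uint32_t size;
    uint32_t padding;
};

/*
 * Validate the body length of a GETXATTR/LISTXATTR reply.
 * req_size is the buffer size the caller asked for. A value of 0 means
 * "tell me how much space is needed", so the reply must be exactly one
 * size record. Otherwise any length up to req_size is accepted,
 * including 0.
 */
static int
xattr_reply_len_ok(uint32_t req_size, size_t body_len)
{
    if (req_size == 0)
        return (body_len == sizeof(struct xattr_size_reply) ? 0 : EINVAL);
    return (body_len <= req_size ? 0 : EINVAL);
}
```

With a request size of 0 only an 8-byte reply passes; with a request size of
16, replies of 0 to 16 bytes pass and anything longer yields EINVAL.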