From: "Robert N. M. Watson"
Date: Tue, 8 Mar 2016 06:38:00 +0000
Subject: Re: FUSE extended attribute patches available
To: Ken Merry
Cc: Robert Watson, Julian Elischer, Rick Macklem, fs@freebsd.org,
    scsi@freebsd.org

Just a quick observation: to avoid application change, you could actually
leave the 'user.' on the front of the strings? It's not harmful, it just
doesn't serve the same function. This might keep documentation more in
sync, etc.

> On 7 Mar 2016, at 22:28, Ken Merry wrote:
>
>> On Mar 7, 2016, at 2:59 AM, Robert Watson wrote:
>>
>> FreeBSD and Linux’s extended-attribute models were inherited from IRIX,
>> as they were introduced to solve the same problems: a place to store
>> metadata such as ACLs, MAC labels, capability masks, etc. IRIX had three
>> namespaces: one each for “user”, “root”, and “secure”, reflecting
>> whether or not they were managed by the file owner (or permissions), the
>> privileged root user, or part of the TCB protection mechanism (e.g., for
>> integrity labels).
>>
>> These extended attributes should not be confused with the filesystem
>> feature of the same name in NFSv4, which is sometimes known by the name
>> “file fork” or “data streams”.
>> EAs in IRIX/FreeBSD/Linux/HPFS/etc are tuple pairs of names and values
>> intended to be written atomically or updated in place specifically for
>> (shortish) metadata such as ACLs, rather than being complete separate
>> data spaces for I/O (e.g., that could be memory mapped).
>
> It would be nice to have NFSv4 / Solaris style alternate data streams.
> ZFS handles them already, but I suppose it would take more work to
> support them in UFS.
>
>> In FreeBSD’s design, we incorporated the disjoint namespace model,
>> providing USER and SYSTEM, the former being managed by the file owner
>> (and those given suitable permission), and the latter being used for TCB
>> mechanisms such as the implementations of MAC labels, ACLs, etc.
>>
>> In Linux, they adopted a more free-form mechanism based on a single
>> combined namespace with a prefix -- e.g., user.FOO, and system.BAR. Over
>> time it looks like that namespace has been expanded in various
>> filesystem-specific ways. We also have room to expand our namespace, but
>> from the description below, it’s not clear quite what the right
>> mechanism is.
>>
>> One path would be to introduce a new namespace for filesystem-specific
>> attributes -- e.g., EXTATTR_NAMESPACE_FS?
>>
>> But I think the key question here is whether the existing namespaces can
>> provide the semantics you need. If not, then we likely need a new
>> namespace. But then we get the question as to who controls use of the
>> namespace. Certainly “the filesystem” is one option, but then you will
>> get inconsistency in approaches between filesystems and applications --
>> across various dimensions including protection (who can read/modify
>> them?), allocation (who decides what names should be used for what?),
>> and semantics (what applications can use them, and who backs them up?).
>>
>> For example: who should be responsible for backing up those attributes?
>> For ‘system’ attributes in FreeBSD, it is assumed that backup tools will
>> be aware of the services layered over the attributes -- e.g., that they
>> will back up ACLs using the ACL API, rather than backing up the binary
>> EAs holding the ACLs. For ‘user’ attributes, it is assumed that backup
>> tools (e.g., tar) must explicitly preserve them, since they are
>> user-defined and user-managed. For filesystem-specific attributes, some
>> other choice will need to be made -- perhaps filesystem-specific backup
>> tools need to know about them?
>>
>> Note that in the Linux EA model, ACLs are actually accessed via the EA
>> system calls, whereas in FreeBSD, ACLs are a first-class citizen in the
>> system-call API/ABI, and so user applications don’t treat them as EAs.
>> We made that choice as filesystems may choose themselves not to
>> represent ACLs as EAs, and they have real semantics visible to the VFS
>> layer. In Linux, I believe they chose to pass them via EAs to narrow the
>> system-call interface for filesystem metadata. Both are legitimate
>> choices, but this could also trigger discussions about whether new
>> attributes are best accessed via the EA interface, or new system calls.
>> For filesystem-specific attributes, EAs are likely the better way to go.
>
> It may be that for at least the purposes of FUSE, we can adequately live
> under the USER namespace. That would allow for arbitrary namespaces that
> Linux-centric filesystems create without significant churn in FreeBSD to
> support it.
>
> And of course this is only for the front/top end of a FUSE filesystem.
> What the filesystem actually does with the extended attributes that the
> user sets on top is another question altogether. In the case of IBM’s
> LTFS, it stores extended attributes (without the “user.” prefix) in the
> LTFS index, which is an XML file that resides on tape.
> For other filesystems, the answer could also vary significantly. A few
> that I examined in sysutils/fusefs* used extended attributes on the
> backend (usually on a backing filesystem) under Linux only, but not on
> the front (user facing) end.
>
> In order to make arbitrary namespaces in FUSE work in FreeBSD under the
> user namespace, we’ll have to do what Rick was talking about and just
> not include the namespace as a prefix when we get/set attributes. This
> will allow using any sort of namespace or attribute name that the FUSE
> filesystem wants to use.
>
> The impact of this, from a porting standpoint, is that the FUSE
> filesystems will have to know that on FreeBSD, they cannot/should not
> expect to see the “user.” namespace prefix, but they might see other
> namespace prefixes.
>
> I took a look at the way LTFS and Gluster work with respect to extended
> attributes with MacOS, since it seems that is how MacOS works, and it’s
> less obvious to me what is going on with Gluster.
> They’ve got this function:
>
> #ifdef GF_DARWIN_HOST_OS
> static int
> set_xattr_user_namespace_mode (struct posix_private *priv, const char *str)
> {
>         if (strcmp (str, "none") == 0)
>                 priv->xattr_user_namespace = XATTR_NONE;
>         else if (strcmp (str, "strip") == 0)
>                 priv->xattr_user_namespace = XATTR_STRIP;
>         else if (strcmp (str, "append") == 0)
>                 priv->xattr_user_namespace = XATTR_APPEND;
>         else if (strcmp (str, "both") == 0)
>                 priv->xattr_user_namespace = XATTR_BOTH;
>         else
>                 return -1;
>         return 0;
> }
> #endif
>
> Although it’s not clear that they do anything with values other than
> XATTR_STRIP.
>
> With LTFS, since they either assume a “user.” prefix on Linux, or no
> prefix on Windows and MacOS X, it’s more straightforward.
>
> Ken
>
>>
>> Robert
>>
>>> On 7 Mar 2016, at 07:16, Julian Elischer wrote:
>>>
>>> On 5/03/2016 7:06 PM, Rick Macklem wrote:
>>>> Ken Merry wrote:
>>>>> I have patches for FreeBSD’s FUSE filesystem kernel module to support
>>>>> extended attributes:
>>> oh showing off your masochistic side eh?
>>>
>>>>> https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
>>> I spent an hour beating my head against fuse yesterday.
>>> then realised that it's an old version on our product. We really have
>>> to get off 8.0
>>> (hopefully a matter of weeks now to a 10.x switch)
>>> Now all I need is to find a FreeBSD filesystem expert
>>> (ZFS/NFS/CIFS/GFS) to hire.
>>>
>>>
>>>> The only bit of code I have that might be useful for this patch is:
>>>>         case FUSE_GETXATTR:
>>>>         case FUSE_LISTXATTR:
>>>> !               /*
>>>> !                * These can have varying response lengths, and 0 length
>>>> !                * isn't necessarily invalid.
>>>> !                */
>>>> !               err = 0;
>>>> *** I came up with this:
>>>>                 fgin = (struct fuse_getxattr_in *)
>>>>                     ((char *)ftick->tk_ms_fiov.base +
>>>>                     sizeof(struct fuse_in_header));
>>>>                 if (fgin->size == 0)
>>>>                         err = (blen == sizeof(struct fuse_getxattr_out)) ? 0 :
>>>>                             EINVAL;
>>>>                 else
>>>>                         err = (blen <= fgin->size) ? 0 : EINVAL;
>>>>                 break;
>>>> I think I got the size check right?
>>>>
>>>> The big question is...
>>>> What to do with the NAMESPACE?
>>>> - My code fails for SYSTEM and does USER without prepending "user.".
>>>>   (That seemed to be what rwatson@ felt was reasonable. I thought our
>>>>   discussion was on a mailing list, but I can't find it.)
>>>>   I've cc'd him. Maybe he can comment again.
>>> Is there a standard for extended attributes I should know about?
>>> It seems to me that it's a bit like the wild west.
>>> Extended attributes seem to be "every OS for himself".
>>>
>>>>
>>>> - If you stick with prepending "user." or "system." there needs to be
>>>>   some way to bypass this so that attributes that don't start in "user."
>>>>   or "system." can be accessed. I've seen "trusted." and "glusterfs."
>>>>   on GlusterFS.
>>>>   --> Maybe a new namespace called something like "nil" that just
>>>>       bypasses any USER or SYSTEM checks?
>>>>
>>>> rick
>>>>
>>>>> The patch implements the get/set/delete/list extended attribute
>>>>> methods. The listing code also converts extended attribute lists from
>>>>> the Linux/FUSE format to the FreeBSD format.
>>>>> For example:
>>>>>
>>>>> # touch foo
>>>>> # ls -la foo
>>>>> -rwxrwxrwx 1 root wheel 0 Feb 29 21:40 foo
>>>>> # lsextattr user foo
>>>>> foo
>>>>> # setextattr user testattr1 "12345678" foo
>>>>> # lsextattr user foo
>>>>> foo testattr1
>>>>> # getextattr user testattr1 foo
>>>>> foo 12345678
>>>>> # setextattr user testattr2 "87654321" foo
>>>>> # lsextattr user foo
>>>>> foo testattr2 testattr1
>>>>> # rmextattr user testattr1 foo
>>>>> # lsextattr user foo
>>>>> foo testattr2
>>>>> # getextattr user testattr1 foo
>>>>> getextattr: foo: failed: Attribute not found
>>>>> # getextattr user testattr2 foo
>>>>> foo 87654321
>>>>>
>>>>>
>>>>> Just to be clear on what this does, it only provides extended
>>>>> attribute support to FreeBSD applications if the underlying FUSE
>>>>> filesystem implements FUSE extended attribute support. Many FUSE
>>>>> filesystems don’t support the extended attribute VFS operations.
>>>>>
>>>>> I have tested this out on IBM’s LTFS implementation, but I have not
>>>>> yet found another FUSE filesystem that supports extended attributes.
>>>>> If anyone knows of one, please let me know so I can try it out. (I
>>>>> looked through a number of the filesystems in sysutils/fusefs* in the
>>>>> ports tree.)
>>>>>
>>>>> Any feedback is welcome. I’m planning to check this into FreeBSD/head
>>>>> in the next week or so.
>>>>>
>>>>> Obviously, I’ve also ported IBM’s LTFS implementation to FreeBSD. It
>>>>> works in the standard FUSE mode, and you can also link it into an
>>>>> application as a library if you don’t want to incur the overhead of
>>>>> running through FUSE. I haven’t gotten around to packaging it up to
>>>>> go out for testing / review.
>>>>>
>>>>> If anyone has IBM LTO-5 or newer tape drives, or IBM TS1140 or newer
>>>>> tape drives, and wants to try it out, let me know.
>>>>> I’ll send you the code when I’ve got it at least somewhat ready.
>>>>> This is IBM-specific, and won’t work on HP tape drives.
>>>>>
>>>>> Ken
>>>>> --
>>>>> Ken Merry
>>>>> ken@FreeBSD.ORG
>>>>>
>>>>> _______________________________________________
>>>>> freebsd-fs@freebsd.org mailing list
>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
> --
> Ken Merry
> ken@FreeBSD.ORG
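
The list conversion Ken describes in the original announcement (Linux/FUSE
attribute list format to FreeBSD format) can be sketched roughly as below.
This is not taken from the patch. It is a minimal userland illustration
that assumes the Linux side is a run of NUL-terminated names, as
listxattr(2) returns them, and the FreeBSD side is a run of names each
preceded by a one-byte length with no terminator, as extattr_list_file(2)
returns them. The function name linux_list_to_freebsd and the policy of
keeping only names that carry a given prefix and stripping that prefix are
hypothetical; whether "user." should be stripped at all is the open
question in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Convert a Linux/FUSE-style attribute list (NUL-terminated names packed
 * back to back) into a FreeBSD-style list (each name preceded by a
 * one-byte length, no terminator).
 *
 * Assumed policy: only names that start with `prefix` and have at least
 * one character after it are kept, and the prefix is stripped. Passing ""
 * keeps every non-empty name unchanged.
 *
 * Returns the number of bytes written to `out`, or -1 if the input is not
 * NUL-terminated, a name is longer than 255 bytes, or `out` is too small.
 */
static long
linux_list_to_freebsd(const char *in, size_t in_len, const char *prefix,
    unsigned char *out, size_t out_len)
{
	size_t prefix_len = strlen(prefix);
	size_t pos = 0, used = 0;

	assert(in != NULL || in_len == 0);

	while (pos < in_len) {
		const char *name = in + pos;
		const char *nul = memchr(name, '\0', in_len - pos);
		size_t name_len;

		if (nul == NULL)
			return (-1);	/* last name is not terminated */
		name_len = (size_t)(nul - name);
		pos += name_len + 1;	/* step past the name and its NUL */

		/* Skip names outside the requested prefix. */
		if (name_len <= prefix_len ||
		    strncmp(name, prefix, prefix_len) != 0)
			continue;
		name += prefix_len;
		name_len -= prefix_len;

		/* The FreeBSD length field is a single byte. */
		if (name_len > 255 || used + 1 + name_len > out_len)
			return (-1);
		out[used++] = (unsigned char)name_len;
		memcpy(out + used, name, name_len);
		used += name_len;
	}
	return ((long)used);
}
```

With the prefix "user.", a Linux-style list holding user.testattr1,
user.testattr2 and trusted.x would come out as two length-prefixed entries,
testattr1 and testattr2, and the trusted.x entry would be dropped. Under
Rick's suggestion of not prepending "user." at all, the same helper could
be called with an empty prefix so that names such as trusted.x pass
through untouched.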