Date:      Tue, 8 Mar 2016 06:38:00 +0000
From:      "Robert N. M. Watson" <rwatson@freebsd.org>
To:        Ken Merry <ken@freebsd.org>
Cc:        Robert Watson <rwatson@FreeBSD.org>, Julian Elischer <julian@FreeBSD.ORG>, Rick Macklem <rmacklem@uoguelph.ca>, fs@freebsd.org, scsi@freebsd.org
Subject:   Re: FUSE extended attribute patches available
Message-ID:  <CEC3A6BB-88B9-42E8-ABCA-718FA59D1075@freebsd.org>
In-Reply-To: <BBF1EEE5-A6A9-46A0-B5E5-9FFD90631636@freebsd.org>
References:  <CD5FCB90-1952-4014-BBE0-1BFF1EF85E17@freebsd.org> <800018199.6694281.1457233600357.JavaMail.zimbra@uoguelph.ca> <56DD2AB6.1030407@freebsd.org> <6AF0FC23-CC34-43EA-A008-9FB82FB21558@FreeBSD.org> <BBF1EEE5-A6A9-46A0-B5E5-9FFD90631636@freebsd.org>

Just a quick observation: to avoid application change, you could actually leave the 'user.' on the front of the strings? It's not harmful, it just doesn't serve the same function. This might keep documentation more in sync, etc.


Sent from my iPhone

> On 7 Mar 2016, at 22:28, Ken Merry <ken@freebsd.org> wrote:
>
>> On Mar 7, 2016, at 2:59 AM, Robert Watson <rwatson@FreeBSD.org> wrote:
>>
>> FreeBSD and Linux’s extended-attribute models were inherited from IRIX, as they were introduced to solve the same problems: a place to store metadata such as ACLs, MAC labels, capability masks, etc. IRIX had three namespaces: one each for “user”, “root”, and “secure”, reflecting whether they were managed by the file owner (or permissions), the privileged root user, or part of the TCB protection mechanism (e.g., for integrity labels).
>>
>> These extended attributes should not be confused with the filesystem feature of the same name in NFSv4, which is sometimes known by the name “file fork” or “data streams”. EAs in IRIX/FreeBSD/Linux/HPFS/etc. are name/value pairs intended to be written atomically or updated in place, specifically for (shortish) metadata such as ACLs, rather than being completely separate data spaces for I/O (e.g., that could be memory mapped).
>
> It would be nice to have NFSv4 / Solaris style alternate data streams.  ZFS handles them already, but I suppose it would take more work to support them in UFS.
>
>> In FreeBSD’s design, we incorporated the disjoint namespace model, providing USER and SYSTEM, the former being managed by the file owner (and those given suitable permission), and the latter being used for TCB mechanisms such as the implementations of MAC labels, ACLs, etc.
>>
>> In Linux, they adopted a more free-form mechanism based on a single combined namespace with a prefix, e.g., user.FOO and system.BAR. Over time it looks like that namespace has been expanded in various filesystem-specific ways. We also have room to expand our namespace, but from the description below, it’s not clear quite what the right mechanism is.
>>
>> One path would be to introduce a new namespace for filesystem-specific attributes, e.g., EXTATTR_NAMESPACE_FS?
>>
>> But I think the key question here is whether the existing namespaces can provide the semantics you need. If not, then we likely need a new namespace. But then we get the question as to who controls use of the namespace. Certainly “the filesystem” is one option, but then you will get inconsistency in approaches between filesystems and applications, across various dimensions including protection (who can read/modify them?), allocation (who decides what names should be used for what?), and semantics (what applications can use them, and who backs them up?).
>>
>> For example: who should be responsible for backing up those attributes? For ‘system’ attributes in FreeBSD, it is assumed that backup tools will be aware of the services layered over the attributes, e.g., that they will back up ACLs using the ACL API, rather than backing up the binary EAs holding the ACLs. For ‘user’ attributes, it is assumed that backup tools (e.g., tar) must explicitly preserve them, since they are user-defined and user-managed. For filesystem-specific attributes, some other choice will need to be made; perhaps filesystem-specific backup tools need to know about them?
>>
>> Note that in the Linux EA model, ACLs are actually accessed via the EA system calls, whereas in FreeBSD, ACLs are a first-class citizen in the system-call API/ABI, and so user applications don’t treat them as EAs. We made that choice as filesystems may choose themselves not to represent ACLs as EAs, and they have real semantics visible to the VFS layer. In Linux, I believe they chose to pass them via EAs to narrow the system-call interface for filesystem metadata. Both are legitimate choices, but this could also trigger discussions about whether new attributes are best accessed via the EA interface, or new system calls. For filesystem-specific attributes, EAs are likely the better way to go.
>
> It may be that for at least the purposes of FUSE, we can adequately live under the USER namespace.  That would allow for arbitrary namespaces that Linux-centric filesystems create without significant churn in FreeBSD to support it.
>
> And of course this is only for the front/top end of a FUSE filesystem.  What the filesystem actually does with the extended attributes that the user sets on top is another question altogether.  In the case of IBM’s LTFS, it stores extended attributes (without the “user.” prefix) in the LTFS index, which is an XML file that resides on tape.  For other filesystems, the answer could also vary significantly.  A few that I examined in sysutils/fusefs* used extended attributes on the backend (usually on a backing filesystem) under Linux only, but not on the front (user facing) end.
>
> In order to make arbitrary namespaces in FUSE work in FreeBSD under the user namespace, we’ll have to do what Rick was talking about and just not include the namespace as a prefix when we get/set attributes.  This will allow using any sort of namespace or attribute name that the FUSE filesystem wants to use.
>
> The impact of this, from a porting standpoint, is that the FUSE filesystems will have to know that on FreeBSD, they cannot/should not expect to see the “user.” namespace prefix, but they might see other namespace prefixes.
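
The two naming policies discussed above (pass the name through untouched, or prepend a Linux-style prefix) can be sketched in portable C. This is only an illustration, not code from the patch: the enum, the function name fuse_attr_name() and the add_prefix flag are hypothetical, and the enum merely stands in for FreeBSD's EXTATTR_NAMESPACE_USER and EXTATTR_NAMESPACE_SYSTEM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for EXTATTR_NAMESPACE_USER / _SYSTEM. */
enum ea_namespace { EA_NS_USER = 1, EA_NS_SYSTEM = 2 };

/*
 * Build the attribute name that would be handed to the FUSE daemon.
 *
 * add_prefix == 0: the name is copied unchanged and only the USER
 * namespace is accepted, so names such as "trusted.foo" stay reachable.
 * add_prefix != 0: a Linux-style "user." or "system." prefix is added.
 *
 * Returns 0 on success, -1 if the namespace is rejected or buf is
 * too small for the result plus its terminating NUL.
 */
static int
fuse_attr_name(enum ea_namespace ns, const char *name, int add_prefix,
    char *buf, size_t buflen)
{
	const char *prefix = "";
	size_t plen, nlen;

	assert(name != NULL && buf != NULL);
	if (add_prefix)
		prefix = (ns == EA_NS_USER) ? "user." : "system.";
	else if (ns != EA_NS_USER)
		return (-1);	/* no-prefix policy: SYSTEM is refused */

	plen = strlen(prefix);
	nlen = strlen(name);
	if (plen + nlen + 1 > buflen)
		return (-1);
	memcpy(buf, prefix, plen);
	memcpy(buf + plen, name, nlen + 1);	/* copies the NUL too */
	return (0);
}
```

With add_prefix set to 0, a caller asking for "trusted.foo" in the USER namespace gets exactly "trusted.foo" back; with add_prefix set to 1, "testattr1" becomes "user.testattr1".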
>
> I took a look at the way LTFS and Gluster work with respect to extended attributes with MacOS, since it seems that is how MacOS works, and it’s less obvious to me what is going on with Gluster.  They’ve got this function:
>
> #ifdef GF_DARWIN_HOST_OS
> static int
> set_xattr_user_namespace_mode (struct posix_private *priv, const char *str)
> {
>        if (strcmp (str, "none") == 0)
>                priv->xattr_user_namespace = XATTR_NONE;
>        else if (strcmp (str, "strip") == 0)
>                priv->xattr_user_namespace = XATTR_STRIP;
>        else if (strcmp (str, "append") == 0)
>                priv->xattr_user_namespace = XATTR_APPEND;
>        else if (strcmp (str, "both") == 0)
>                priv->xattr_user_namespace = XATTR_BOTH;
>        else
>                return -1;
>        return 0;
> }
> #endif
>
> Although it’s not clear that they do anything with values other than XATTR_STRIP.
>
> With LTFS, since they either assume a “user.” prefix on Linux, or no prefix on Windows and MacOS X, it’s more straightforward.
>
> Ken
>
>>
>> Robert
>>
>>> On 7 Mar 2016, at 07:16, Julian Elischer <julian@FreeBSD.ORG> wrote:
>>>
>>> On 5/03/2016 7:06 PM, Rick Macklem wrote:
>>>> Ken Merry wrote:
>>>>> I have patches for FreeBSD’s FUSE filesystem kernel module to support
>>>>> extended attributes:
>>> oh showing off your masochistic side eh?
>>>
>>>>> https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
>>> I spent an hour beating my head against fuse yesterday.
>>> then realised that it's an old version on our product. We really have to get off 8.0
>>> (hopefully a matter of weeks now to a 10.x switch)
>>> Now all I need is to find a FreeBSD filesystem expert (ZFS/NFS/CIFS/GFS) to hire.
>>>
>>>> The only bit of code I have that might be useful for this patch is:
>>>>    case FUSE_GETXATTR:
>>>>    case FUSE_LISTXATTR:
>>>> !        /*
>>>> !         * These can have varying response lengths, and 0 length
>>>> !         * isn't necessarily invalid.
>>>> !         */
>>>> !        err = 0;
>>>> *** I came up with this:
>>>>        fgin = (struct fuse_getxattr_in *)
>>>>            ((char *)ftick->tk_ms_fiov.base +
>>>>             sizeof(struct fuse_in_header));
>>>>        if (fgin->size == 0)
>>>>            err = (blen == sizeof(struct fuse_getxattr_out)) ? 0 :
>>>>                EINVAL;
>>>>        else
>>>>            err = (blen <= fgin->size) ? 0 : EINVAL;
>>>>        break;
>>>> I think I got the size check right?
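
The size check above can be pulled out into a standalone function for experimenting outside the kernel. This is a sketch under assumptions: xattr_reply_len_ok() and struct getxattr_out_sketch are hypothetical names, and the struct is a local stand-in that assumes the FUSE protocol's struct fuse_getxattr_out is two 32-bit fields (size and padding).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for the FUSE protocol's struct fuse_getxattr_out. */
struct getxattr_out_sketch {
	uint32_t size;
	uint32_t padding;
};

/*
 * requested: the size field from the request; 0 means the caller only
 *            wants to learn how large the value or list is.
 * blen:      length of the reply body, excluding the reply header.
 *
 * Returns 0 if the reply length is plausible, EINVAL otherwise.
 */
static int
xattr_reply_len_ok(uint32_t requested, size_t blen)
{
	/* Layout assumption stated above: two 32-bit fields, 8 bytes. */
	assert(sizeof(struct getxattr_out_sketch) == 8);

	if (requested == 0)
		return (blen == sizeof(struct getxattr_out_sketch) ?
		    0 : EINVAL);
	/* Data reply: anything up to the requested size, including 0. */
	return (blen <= requested ? 0 : EINVAL);
}
```

A size probe (requested == 0) must therefore be answered with exactly the 8-byte structure, while a data request for 16 bytes accepts any body of 0 to 16 bytes.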
>>>>
>>>> The big question is...
>>>> What to do with the NAMESPACE?
>>>> - My code fails for SYSTEM and does USER without prepending "user.".
>>>> (That seemed to be what rwatson@ felt was reasonable. I thought our
>>>>  discussion was on a mailing list, but I can't find it.)
>>>> I've cc'd him. Maybe he can comment again.
>>> Is there a standard for extended attributes I should know about?
>>> It seems to me that it's a bit like the wild west.
>>> Extended attributes seem to be "every OS for himself".
>>>
>>>>
>>>> - If you stick with prepending "user." or "system." there needs to be
>>>> some way to bypass this so that attributes that don't start in "user."
>>>> or "system." can be accessed. I've seen "trusted." and "glusterfs."
>>>> on GlusterFS.
>>>> --> Maybe a new namespace called something like "nil" that just bypasses
>>>>     any USER or SYSTEM checks?
>>>>
>>>> rick
>>>>
>>>>> The patch implements the get/set/delete/list extended attribute methods.  The
>>>>> listing code also converts extended attribute lists from the Linux/FUSE
>>>>> format to the FreeBSD format.  For example:
>>>>>
>>>>> # touch foo
>>>>> # ls -la foo
>>>>> -rwxrwxrwx  1 root  wheel  0 Feb 29 21:40 foo
>>>>> # lsextattr user foo
>>>>> foo
>>>>> # setextattr user testattr1 "12345678" foo
>>>>> # lsextattr user foo
>>>>> foo     testattr1
>>>>> # getextattr user testattr1 foo
>>>>> foo     12345678
>>>>> # setextattr user testattr2 "87654321" foo
>>>>> # lsextattr user foo
>>>>> foo     testattr2       testattr1
>>>>> # rmextattr user testattr1 foo
>>>>> # lsextattr user foo
>>>>> foo     testattr2
>>>>> # getextattr user testattr1 foo
>>>>> getextattr: foo: failed: Attribute not found
>>>>> # getextattr user testattr2 foo
>>>>> foo     87654321
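
The list conversion mentioned before the session can be sketched as follows. Linux-style listxattr output is a run of NUL-terminated names, whereas FreeBSD's extattr_list_file() returns each name preceded by a single length byte and without a terminating NUL. The function below is a user-space illustration only, not the code from the patch; list_linux_to_bsd() and its prefix parameter are hypothetical, and whether a prefix such as "user." is filtered and stripped is exactly the policy question debated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a Linux/FUSE-style list ("name\0name\0...") in src[0..srclen)
 * into the FreeBSD style (one length byte, then the name, no NUL).
 *
 * Entries that do not start with prefix are skipped; prefix is stripped
 * from those that do.  Pass "" to keep every name unchanged.
 *
 * Returns the number of bytes written to dst, or -1 on an unterminated
 * entry, an empty name, a name longer than 255 bytes, or lack of space.
 */
static long
list_linux_to_bsd(const char *src, size_t srclen, const char *prefix,
    unsigned char *dst, size_t dstlen)
{
	size_t plen, off = 0, out = 0;

	assert(src != NULL && prefix != NULL && dst != NULL);
	plen = strlen(prefix);
	while (off < srclen) {
		const char *name = src + off;
		const char *nul = memchr(name, '\0', srclen - off);
		size_t len;

		if (nul == NULL)
			return (-1);	/* unterminated final entry */
		len = (size_t)(nul - name);
		off += len + 1;		/* step past the name and its NUL */
		if (len < plen || memcmp(name, prefix, plen) != 0)
			continue;	/* belongs to another namespace */
		name += plen;
		len -= plen;
		if (len == 0 || len > 255 || dstlen - out < len + 1)
			return (-1);
		dst[out++] = (unsigned char)len;
		memcpy(dst + out, name, len);
		out += len;
	}
	return ((long)out);
}
```

Given the input "user.testattr1\0trusted.x\0user.ab\0" and the prefix "user.", the output is the byte 9 followed by "testattr1", then the byte 2 followed by "ab"; the "trusted.x" entry is dropped.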
>>>>>
>>>>> Just to be clear on what this does, it only provides extended attribute
>>>>> support to FreeBSD applications if the underlying FUSE filesystem implements
>>>>> FUSE extended attribute support.  Many FUSE filesystems don’t support the
>>>>> extended attribute VFS operations.
>>>>>
>>>>> I have tested this out on IBM’s LTFS implementation, but I have not yet found
>>>>> another FUSE filesystem that supports extended attributes.  If anyone knows
>>>>> of one, please let me know so I can try it out.  (I looked through a number
>>>>> of the filesystems in sysutils/fusefs* in the ports tree.)
>>>>>
>>>>> Any feedback is welcome.  I’m planning to check this into FreeBSD/head in the
>>>>> next week or so.
>>>>>
>>>>> Obviously, I’ve also ported IBM’s LTFS implementation to FreeBSD.  It works
>>>>> in the standard FUSE mode, and you can also link it into an application as a
>>>>> library if you don’t want to incur the overhead of running through FUSE.  I
>>>>> haven’t gotten around to packaging it up to go out for testing / review.
>>>>>
>>>>> If anyone has IBM LTO-5 or newer tape drives, or IBM TS1140 or newer tape
>>>>> drives, and wants to try it out, let me know.  I’ll send you the code when
>>>>> I’ve got it at least somewhat ready.  This is IBM-specific, and won’t work
>>>>> on HP tape drives.
>>>>>
>>>>> Ken
>>>>> --
>>>>> Ken Merry
>>>>> ken@FreeBSD.ORG
>>>>>
>>>>> _______________________________________________
>>>>> freebsd-fs@freebsd.org mailing list
>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>
> --
> Ken Merry
> ken@FreeBSD.ORG


