Date:      Mon, 7 Mar 2016 17:28:16 -0500
From:      Ken Merry <ken@freebsd.org>
To:        Robert Watson <rwatson@FreeBSD.org>
Cc:        Julian Elischer <julian@FreeBSD.ORG>, Rick Macklem <rmacklem@uoguelph.ca>,  fs@freebsd.org, scsi@freebsd.org
Subject:   Re: FUSE extended attribute patches available
Message-ID:  <BBF1EEE5-A6A9-46A0-B5E5-9FFD90631636@freebsd.org>
In-Reply-To: <6AF0FC23-CC34-43EA-A008-9FB82FB21558@FreeBSD.org>
References:  <CD5FCB90-1952-4014-BBE0-1BFF1EF85E17@freebsd.org> <800018199.6694281.1457233600357.JavaMail.zimbra@uoguelph.ca> <56DD2AB6.1030407@freebsd.org> <6AF0FC23-CC34-43EA-A008-9FB82FB21558@FreeBSD.org>



> On Mar 7, 2016, at 2:59 AM, Robert Watson <rwatson@FreeBSD.org> wrote:
>
> FreeBSD and Linux’s extended-attribute models were inherited from
> IRIX, as they were introduced to solve the same problems: a place to
> store metadata such as ACLs, MAC labels, capability masks, etc. IRIX
> had three namespaces: one each for “user”, “root”, and “secure”,
> reflecting whether they were managed by the file owner (or
> permissions), the privileged root user, or part of the TCB protection
> mechanism (e.g., for integrity labels).
>
> These extended attributes should not be confused with the filesystem
> feature of the same name in NFSv4, which is sometimes known by the
> name “file fork” or “data streams”. EAs in
> IRIX/FreeBSD/Linux/HPFS/etc. are name/value pairs intended to be
> written atomically or updated in place, specifically for (shortish)
> metadata such as ACLs, rather than being complete separate data
> spaces for I/O (e.g., that could be memory mapped).

It would be nice to have NFSv4 / Solaris style alternate data streams.
ZFS handles them already, but I suppose it would take more work to
support them in UFS.

> In FreeBSD’s design, we incorporated the disjoint namespace model,
> providing USER and SYSTEM, the former being managed by the file owner
> (and those given suitable permission), and the latter being used for
> TCB mechanisms such as the implementations of MAC labels, ACLs, etc.
>
> In Linux, they adopted a more free-form mechanism based on a single
> combined namespace with a prefix -- e.g., user.FOO and system.BAR.
> Over time it looks like that namespace has been expanded in various
> filesystem-specific ways. We also have room to expand our namespace,
> but from the description below, it’s not clear quite what the right
> mechanism is.
>
> One path would be to introduce a new namespace for filesystem-specific
> attributes -- e.g., EXTATTR_NAMESPACE_FS?
>
> But I think the key question here is whether the existing namespaces
> can provide the semantics you need. If not, then we likely need a new
> namespace. But then we get the question as to who controls use of the
> namespace. Certainly “the filesystem” is one option, but then you will
> get inconsistency in approaches between filesystems and applications
> -- across various dimensions including protection (who can read/modify
> them?), allocation (who decides what names should be used for what?),
> and semantics (what applications can use them, and who backs them
> up?).
>
> For example: who should be responsible for backing up those
> attributes? For ‘system’ attributes in FreeBSD, it is assumed that
> backup tools will be aware of the services layered over the attributes
> -- e.g., that they will back up ACLs using the ACL API, rather than
> backing up the binary EAs holding the ACLs. For ‘user’ attributes, it
> is assumed that backup tools (e.g., tar) must explicitly preserve
> them, since they are user-defined and user-managed. For
> filesystem-specific attributes, some other choice will need to be made
> -- perhaps filesystem-specific backup tools need to know about them?
>
> Note that in the Linux EA model, ACLs are actually accessed via the EA
> system calls, whereas in FreeBSD, ACLs are a first-class citizen in
> the system-call API/ABI, and so user applications don’t treat them as
> EAs. We made that choice as filesystems may themselves choose not to
> represent ACLs as EAs, and they have real semantics visible to the VFS
> layer. In Linux, I believe they chose to pass them via EAs to narrow
> the system-call interface for filesystem metadata. Both are legitimate
> choices, but this could also trigger discussions about whether new
> attributes are best accessed via the EA interface, or new system
> calls. For filesystem-specific attributes, EAs are likely the better
> way to go.

It may be that, at least for the purposes of FUSE, we can adequately
live under the USER namespace.  That would allow for the arbitrary
namespaces that Linux-centric filesystems create without significant
churn in FreeBSD to support them.

And of course this is only for the front/top end of a FUSE filesystem.
What the filesystem actually does with the extended attributes that the
user sets on top is another question altogether.  In the case of IBM’s
LTFS, it stores extended attributes (without the “user.” prefix) in the
LTFS index, which is an XML file that resides on tape.  For other
filesystems, the answer could also vary significantly.  A few that I
examined in sysutils/fusefs* used extended attributes on the backend
(usually on a backing filesystem) under Linux only, but not on the
front (user facing) end.

In order to make arbitrary namespaces in FUSE work in FreeBSD under the
user namespace, we’ll have to do what Rick was talking about and just
not include the namespace as a prefix when we get/set attributes.  This
will allow using any sort of namespace or attribute name that the FUSE
filesystem wants to use.

The impact of this, from a porting standpoint, is that the FUSE
filesystems will have to know that on FreeBSD, they cannot/should not
expect to see the “user.” namespace prefix, but they might see other
namespace prefixes.
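
As a rough illustration of that porting impact, here is a minimal sketch
of how a ported filesystem might normalize incoming attribute names.
The helper name xattr_internal_name() and the expect_user_prefix flag
are hypothetical, not taken from LTFS, Gluster, or the FUSE API.  The
sketch only assumes that Linux hands the filesystem names carrying a
"user." prefix and that FreeBSD would hand it the bare name:

```c
#include <string.h>

/*
 * Hypothetical helper (not part of any real FUSE filesystem): return
 * the attribute name the filesystem should use internally.
 *
 * expect_user_prefix is nonzero on platforms where names arrive with a
 * "user." prefix (assumed here: Linux) and zero where they arrive bare
 * (assumed here: FreeBSD with the approach described above).
 */
const char *
xattr_internal_name(const char *name, int expect_user_prefix)
{
	static const char prefix[] = "user.";
	const size_t plen = sizeof(prefix) - 1;

	/* Strip the prefix only where the platform is expected to add it. */
	if (expect_user_prefix && strncmp(name, prefix, plen) == 0)
		return (name + plen);

	/* Otherwise pass the name through untouched, other prefixes included. */
	return (name);
}
```

With this, "user.testattr1" arriving on Linux and "testattr1" arriving
on FreeBSD map to the same internal name, while a name such as
"trusted.foo" passes through unchanged.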

I took a look at the way LTFS and Gluster handle extended attributes on
MacOS, since it seems that is how MacOS works (no namespace prefix), and
it’s less obvious to me what is going on with Gluster.  They’ve got this
function:

#ifdef GF_DARWIN_HOST_OS
static int
set_xattr_user_namespace_mode (struct posix_private *priv, const char *str)
{
        if (strcmp (str, "none") == 0)
                priv->xattr_user_namespace = XATTR_NONE;
        else if (strcmp (str, "strip") == 0)
                priv->xattr_user_namespace = XATTR_STRIP;
        else if (strcmp (str, "append") == 0)
                priv->xattr_user_namespace = XATTR_APPEND;
        else if (strcmp (str, "both") == 0)
                priv->xattr_user_namespace = XATTR_BOTH;
        else
                return -1;
        return 0;
}
#endif

Although it’s not clear that they do anything with values other than
XATTR_STRIP.

With LTFS, since they either assume a “user.” prefix on Linux, or no
prefix on Windows and MacOS X, it’s more straightforward.
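
The patch description quoted further down mentions converting attribute
lists from the Linux/FUSE format to the FreeBSD format.  A minimal
sketch of that kind of conversion follows.  It assumes the Linux/FUSE
list is a run of NUL-terminated names, and that the FreeBSD list (as
returned by extattr_list_file(2)) is a run of entries each made of one
length byte followed by the name with no terminator.  The function name
linux_list_to_bsd() is hypothetical and this is not the code from the
patch; names are copied as-is, with no prefix handling:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: convert "name1\0name2\0..." (Linux/FUSE style)
 * into "<len1>name1<len2>name2..." (FreeBSD style).
 *
 * Returns the number of bytes written to out, or -1 if the input is
 * malformed, a name does not fit in one length byte, or out is too
 * small.
 */
long
linux_list_to_bsd(const char *in, size_t inlen, unsigned char *out,
    size_t outlen)
{
	size_t ipos = 0, opos = 0;

	while (ipos < inlen) {
		const char *name = in + ipos;
		const char *end = memchr(name, '\0', inlen - ipos);
		size_t nlen;

		if (end == NULL)
			return (-1);	/* last name is not terminated */
		nlen = (size_t)(end - name);
		if (nlen == 0 || nlen > 255)
			return (-1);	/* empty, or too long for 1 byte */
		if (opos + 1 + nlen > outlen)
			return (-1);	/* output buffer too small */

		out[opos++] = (unsigned char)nlen;	/* length prefix */
		memcpy(out + opos, name, nlen);		/* name, no NUL */
		opos += nlen;
		ipos += nlen + 1;			/* skip name + NUL */
	}
	return ((long)opos);
}
```

Since each NUL terminator is traded for a length byte, the converted
list is the same size as the input whenever the input is well formed.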

Ken


>
> Robert
>
>> On 7 Mar 2016, at 07:16, Julian Elischer <julian@FreeBSD.ORG> wrote:
>>
>> On 5/03/2016 7:06 PM, Rick Macklem wrote:
>>> Ken Merry wrote:
>>>> I have patches for FreeBSD’s FUSE filesystem kernel module to support
>>>> extended attributes:
>> oh showing off your masochistic side eh?
>>
>>>> https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
>>>>
>> I spent an hour beating my head against fuse yesterday.
>> then realised that it's an old version on our product. We really have to get off 8.0
>> (hopefully a matter of weeks now to a 10.x switch)
>> Now all I need is to find a FreeBSD filesystem expert (ZFS/NFS/CIFS/GFS) to hire.
>>
>>
>>> The only bit of code I have that might be useful for this patch is:
>>>  	case FUSE_GETXATTR:
>>>  	case FUSE_LISTXATTR:
>>> ! 		/*
>>> ! 		 * These can have varying response lengths, and 0 length
>>> ! 		 * isn't necessarily invalid.
>>> ! 		 */
>>> ! 		err = 0;
>>> *** I came up with this:
>>> 		fgin = (struct fuse_getxattr_in *)
>>> 		    ((char *)ftick->tk_ms_fiov.base +
>>> 		     sizeof(struct fuse_in_header));
>>> 		if (fgin->size == 0)
>>> 			err = (blen == sizeof(struct fuse_getxattr_out)) ? 0 :
>>> 			    EINVAL;
>>> 		else
>>> 			err = (blen <= fgin->size) ? 0 : EINVAL;
>>>  		break;
>>> I think I got the size check right?
>>>
>>> The big question is...
>>> What to do with the NAMESPACE?
>>> - My code fails for SYSTEM and does USER without prepending "user.".
>>>  (That seemed to be what rwatson@ felt was reasonable. I thought our
>>>   discussion was on a mailing list, but I can't find it.)
>>>  I've cc'd him. Maybe he can comment again.
>> Is there a standard for extended attributes I should know about?
>> It seems to me that it's a bit like the wild west.
>> Extended attributes seem to be "every OS for himself".
>>
>>>
>>> - If you stick with prepending "user." or "system." there needs to be
>>>  some way to bypass this so that attributes that don't start with "user."
>>>  or "system." can be accessed. I've seen "trusted." and "glusterfs."
>>>  on GlusterFS.
>>>  --> Maybe a new namespace called something like "nil" that just bypasses
>>>      any USER or SYSTEM checks?
>>>
>>> rick
>>>
>>>> The patch implements the get/set/delete/list extended attribute methods.  The
>>>> listing code also converts extended attribute lists from the Linux/FUSE
>>>> format to the FreeBSD format.  For example:
>>>>
>>>> # touch foo
>>>> # ls -la foo
>>>> -rwxrwxrwx  1 root  wheel  0 Feb 29 21:40 foo
>>>> # lsextattr user foo
>>>> foo
>>>> # setextattr user testattr1 "12345678" foo
>>>> # lsextattr user foo
>>>> foo     testattr1
>>>> # getextattr user testattr1 foo
>>>> foo     12345678
>>>> # setextattr user testattr2 "87654321" foo
>>>> # lsextattr user foo
>>>> foo     testattr2       testattr1
>>>> # rmextattr user testattr1 foo
>>>> # lsextattr user foo
>>>> foo     testattr2
>>>> # getextattr user testattr1 foo
>>>> getextattr: foo: failed: Attribute not found
>>>> # getextattr user testattr2 foo
>>>> foo     87654321
>>>>
>>>>
>>>> Just to be clear on what this does, it only provides extended attribute
>>>> support to FreeBSD applications if the underlying FUSE filesystem implements
>>>> FUSE extended attribute support.  Many FUSE filesystems don’t support the
>>>> extended attribute VFS operations.
>>>>
>>>> I have tested this out on IBM’s LTFS implementation, but I have not yet found
>>>> another FUSE filesystem that supports extended attributes.  If anyone knows
>>>> of one, please let me know so I can try it out.  (I looked through a number
>>>> of the filesystems in sysutils/fusefs* in the ports tree.)
>>>>
>>>> Any feedback is welcome.  I’m planning to check this into FreeBSD/head in the
>>>> next week or so.
>>>>
>>>> Obviously, I’ve also ported IBM’s LTFS implementation to FreeBSD.  It works
>>>> in the standard FUSE mode, and you can also link it into an application as a
>>>> library if you don’t want to incur the overhead of running through FUSE.  I
>>>> haven’t gotten around to packaging it up to go out for testing / review.
>>>>
>>>> If anyone has IBM LTO-5 or newer tape drives, or IBM TS1140 or newer tape
>>>> drives, and wants to try it out, let me know.  I’ll send you the code when
>>>> I’ve got it at least somewhat ready.  This is IBM-specific, and won’t work
>>>> on HP tape drives.
>>>>
>>>> Ken
>>>> --
>>>> Ken Merry
>>>> ken@FreeBSD.ORG
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> freebsd-fs@freebsd.org mailing list
>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
>>>
>>>
>>
>



-- 
Ken Merry
ken@FreeBSD.ORG





