Date: Mon, 7 Mar 2016 07:59:33 +0000
From: Robert Watson <rwatson@FreeBSD.org>
To: Julian Elischer <julian@FreeBSD.ORG>
Cc: Rick Macklem <rmacklem@uoguelph.ca>, Ken Merry <ken@freebsd.org>, fs@freebsd.org, scsi@freebsd.org
Subject: Re: FUSE extended attribute patches available
Message-ID: <6AF0FC23-CC34-43EA-A008-9FB82FB21558@FreeBSD.org>
In-Reply-To: <56DD2AB6.1030407@freebsd.org>
References: <CD5FCB90-1952-4014-BBE0-1BFF1EF85E17@freebsd.org> <800018199.6694281.1457233600357.JavaMail.zimbra@uoguelph.ca> <56DD2AB6.1030407@freebsd.org>

FreeBSD and Linux’s extended-attribute models were inherited from IRIX, as they were introduced to solve the same problems: a place to store metadata such as ACLs, MAC labels, capability masks, etc. IRIX had three namespaces: one each for “user”, “root”, and “secure”, reflecting whether they were managed by the file owner (or permissions), the privileged root user, or were part of the TCB protection mechanism (e.g., for integrity labels).

These extended attributes should not be confused with the filesystem feature of the same name in NFSv4, which is sometimes known by the name “file fork” or “data streams”. EAs in IRIX/FreeBSD/Linux/HPFS/etc. are pairs of names and values intended to be written atomically or updated in place, specifically for (shortish) metadata such as ACLs, rather than being complete separate data spaces for I/O (e.g., that could be memory mapped).

In FreeBSD’s design, we incorporated the disjoint namespace model, providing USER and SYSTEM, the former being managed by the file owner (and those given suitable permission), and the latter being used for TCB mechanisms such as the implementations of MAC labels, ACLs, etc.

In Linux, they adopted a more free-form mechanism based on a single combined namespace with a prefix, e.g., user.FOO and system.BAR. Over time it looks like that namespace has been expanded in various filesystem-specific ways. We also have room to expand our namespace, but from the description below, it’s not clear quite what the right mechanism is.

One path would be to introduce a new namespace for filesystem-specific attributes, e.g., EXTATTR_NAMESPACE_FS?

But I think the key question here is whether the existing namespaces can provide the semantics you need. If not, then we likely need a new namespace.

But then we get the question as to who controls use of the namespace. Certainly “the filesystem” is one option, but then you will get inconsistency in approaches between filesystems and applications, across various dimensions including protection (who can read/modify them?), allocation (who decides what names should be used for what?), and semantics (what applications can use them, and who backs them up?).

For example: who should be responsible for backing up those attributes? For ‘system’ attributes in FreeBSD, it is assumed that backup tools will be aware of the services layered over the attributes, e.g., that they will back up ACLs using the ACL API, rather than backing up the binary EAs holding the ACLs. For ‘user’ attributes, it is assumed that backup tools (e.g., tar) must explicitly preserve them, since they are user-defined and user-managed. For filesystem-specific attributes, some other choice will need to be made; perhaps filesystem-specific backup tools need to know about them?

Note that in the Linux EA model, ACLs are actually accessed via the EA system calls, whereas in FreeBSD, ACLs are a first-class citizen in the system-call API/ABI, and so user applications don’t treat them as EAs. We made that choice as filesystems may choose themselves not to represent ACLs as EAs, and they have real semantics visible to the VFS layer. In Linux, I believe they chose to pass them via EAs to narrow the system-call interface for filesystem metadata. Both are legitimate choices, but this could also trigger discussions about whether new attributes are best accessed via the EA interface, or new system calls. For filesystem-specific attributes, EAs are likely the better way to go.

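To make the difference between the two models above concrete, here is a minimal sketch of translating between a FreeBSD-style (namespace, name) pair and a Linux-style prefixed name. It is an illustration only, not code from Ken's patch or from the kernel: the enum and both function names are hypothetical, and the enum values merely mirror EXTATTR_NAMESPACE_USER and EXTATTR_NAMESPACE_SYSTEM so the sketch builds without FreeBSD headers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins that mirror FreeBSD's EXTATTR_NAMESPACE_USER and
 * EXTATTR_NAMESPACE_SYSTEM; defined locally so the sketch is portable. */
enum ea_namespace { EA_NS_USER = 1, EA_NS_SYSTEM = 2 };

/*
 * Build a Linux-style prefixed name ("user.foo") from a FreeBSD-style
 * (namespace, name) pair.  Returns 0 on success, -1 if the namespace is
 * unknown or the result does not fit in buf.
 */
static int
ea_to_prefixed(enum ea_namespace ns, const char *name, char *buf, size_t len)
{
	const char *prefix;
	int n;

	assert(name != NULL && buf != NULL);
	switch (ns) {
	case EA_NS_USER:
		prefix = "user.";
		break;
	case EA_NS_SYSTEM:
		prefix = "system.";
		break;
	default:
		return (-1);
	}
	n = snprintf(buf, len, "%s%s", prefix, name);
	return ((n < 0 || (size_t)n >= len) ? -1 : 0);
}

/*
 * Split a prefixed name back into a namespace and a bare name.  Returns -1
 * for prefixes with no counterpart here (e.g. "trusted."), which is exactly
 * the gap discussed in this thread.
 */
static int
ea_from_prefixed(const char *full, enum ea_namespace *ns, const char **name)
{
	assert(full != NULL && ns != NULL && name != NULL);
	if (strncmp(full, "user.", 5) == 0) {
		*ns = EA_NS_USER;
		*name = full + 5;
		return (0);
	}
	if (strncmp(full, "system.", 7) == 0) {
		*ns = EA_NS_SYSTEM;
		*name = full + 7;
		return (0);
	}
	return (-1);
}
```

Names such as "trusted.foo" fall through to the error path in this sketch; whether they should instead map to a new namespace is the open question above.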
Robert

> On 7 Mar 2016, at 07:16, Julian Elischer <julian@FreeBSD.ORG> wrote:
>
> On 5/03/2016 7:06 PM, Rick Macklem wrote:
>> Ken Merry wrote:
>>> I have patches for FreeBSD’s FUSE filesystem kernel module to support
>>> extended attributes:
> oh showing off your masochistic side eh?
>
>>> https://people.freebsd.org/~ken/fuse_extattr.20160229.1.txt
>>>
> I spent an hour beating my head against fuse yesterday.
> then realised that it's an old version on our product. We really have to get off 8.0
> (hopefully a matter of weeks now to a 10.x switch)
> Now all I need is to find a FreeBSD filesystem expert (ZFS/NFS/CIFS/GFS) to hire.
>
>
>> The only bit of code I have that might be useful for this patch is:
>>   case FUSE_GETXATTR:
>>   case FUSE_LISTXATTR:
>> !         /*
>> !          * These can have varying response lengths, and 0 length
>> !          * isn't necessarily invalid.
>> !          */
>> !         err = 0;
>> *** I came up with this:
>>           fgin = (struct fuse_getxattr_in *)
>>               ((char *)ftick->tk_ms_fiov.base +
>>               sizeof(struct fuse_in_header));
>>           if (fgin->size == 0)
>>                   err = (blen == sizeof(struct fuse_getxattr_out)) ? 0 :
>>                       EINVAL;
>>           else
>>                   err = (blen <= fgin->size) ? 0 : EINVAL;
>>           break;
>> I think I got the size check right?
>>
>> The big question is...
>> What to do with the NAMESPACE?
>> - My code fails for SYSTEM and does USER without prepending "user.".
>>   (That seemed to be what rwatson@ felt was reasonable. I thought our
>>   discussion was on a mailing list, but I can't find it.)
>>   I've cc'd him. Maybe he can comment again.
> Is there a standard for extended attributes I should know about?
> It seems to me that it's a bit like the wild west.
> Extended attributes seem to be "every OS for himself".
>
>>
>> - If you stick with prepending "user." or "system." there needs to be
>>   some way to bypass this so that attributes that don't start in "user."
>>   or "system." can be accessed. I've seen "trusted." and "glusterfs."
>>   on GlusterFS.
>>   --> Maybe a new namespace called something like "nil" that just bypasses
>>       any USER or SYSTEM checks?
>>
>> rick
>>
>>> The patch implements the get/set/delete/list extended attribute methods. The
>>> listing code also converts extended attribute lists from the Linux/FUSE
>>> format to the FreeBSD format. For example:
>>>
>>> # touch foo
>>> # ls -la foo
>>> -rwxrwxrwx 1 root wheel 0 Feb 29 21:40 foo
>>> # lsextattr user foo
>>> foo
>>> # setextattr user testattr1 "12345678" foo
>>> # lsextattr user foo
>>> foo testattr1
>>> # getextattr user testattr1 foo
>>> foo 12345678
>>> # setextattr user testattr2 "87654321" foo
>>> # lsextattr user foo
>>> foo testattr2 testattr1
>>> # rmextattr user testattr1 foo
>>> # lsextattr user foo
>>> foo testattr2
>>> # getextattr user testattr1 foo
>>> getextattr: foo: failed: Attribute not found
>>> # getextattr user testattr2 foo
>>> foo 87654321
>>>
>>>
>>> Just to be clear on what this does, it only provides extended attribute
>>> support to FreeBSD applications if the underlying FUSE filesystem implements
>>> FUSE extended attribute support. Many FUSE filesystems don’t support the
>>> extended attribute VFS operations.
>>>
>>> I have tested this out on IBM’s LTFS implementation, but I have not yet found
>>> another FUSE filesystem that supports extended attributes. If anyone knows
>>> of one, please let me know so I can try it out. (I looked through a number
>>> of the filesystems in sysutils/fusefs* in the ports tree.)
>>>
>>> Any feedback is welcome. I’m planning to check this into FreeBSD/head in the
>>> next week or so.
>>>
>>> Obviously, I’ve also ported IBM’s LTFS implementation to FreeBSD. It works
>>> in the standard FUSE mode, and you can also link it into an application as a
>>> library if you don’t want to incur the overhead of running through FUSE.
>>> I haven’t gotten around to packaging it up to go out for testing / review.
>>>
>>> If anyone has IBM LTO-5 or newer tape drives, or IBM TS1140 or newer tape
>>> drives, and wants to try it out, let me know. I’ll send you the code when
>>> I’ve got it at least somewhat ready. This is IBM-specific, and won’t work
>>> on HP tape drives.
>>>
>>> Ken
>>> --
>>> Ken Merry
>>> ken@FreeBSD.ORG
>>>
>>> _______________________________________________
>>> freebsd-fs@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-fs
>>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
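
Ken's description above mentions converting extended attribute lists from the Linux/FUSE format to the FreeBSD format. As a rough sketch of what such a conversion involves (not the code in the patch): Linux's listxattr(2) returns NUL-terminated names packed back to back, while FreeBSD's extattr_list_file(2) returns each name preceded by a single length byte with no terminator. The function name and the prefix-filtering policy below are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert a Linux/FUSE-style attribute list (NUL-terminated names, one
 * after the other) into a FreeBSD-style list (one length byte followed by
 * the name, no terminator).  Only names that start with `prefix` are kept,
 * and the prefix is stripped; other names are skipped.
 *
 * Returns the number of bytes written to out, or -1 if the input's last
 * name is not terminated, a stripped name exceeds 255 bytes, or out is
 * too small.  Hypothetical helper, not part of any FreeBSD API.
 */
static long
ea_list_linux_to_bsd(const char *in, size_t inlen, const char *prefix,
    unsigned char *out, size_t outlen)
{
	size_t plen = strlen(prefix);
	size_t pos = 0, used = 0;

	assert(in != NULL || inlen == 0);
	while (pos < inlen) {
		const char *cur = in + pos;
		const char *end = memchr(cur, '\0', inlen - pos);
		size_t len, nlen;

		if (end == NULL)
			return (-1);	/* final name is not terminated */
		len = (size_t)(end - cur);
		if (len > plen && strncmp(cur, prefix, plen) == 0) {
			nlen = len - plen;
			if (nlen > 255 || used + 1 + nlen > outlen)
				return (-1);
			out[used++] = (unsigned char)nlen;	/* length byte */
			memcpy(out + used, cur + plen, nlen);
			used += nlen;
		}
		pos += len + 1;		/* step over the name and its NUL */
	}
	return ((long)used);
}
```

Called with the prefix "user.", an input holding "user.testattr2", "user.testattr1" and "trusted.x" yields two length-prefixed entries, testattr2 and testattr1, and silently drops the third. What to do with such dropped names is the namespace question raised earlier in the thread.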