Date: Tue, 26 Nov 2013 17:42:15 -0500 (EST)
From: Rick Macklem
To: Cedric Blancher
Message-ID: <1300352912.21429386.1385505735146.JavaMail.root@uoguelph.ca>
Subject: Re: Alternate Data Stream Support in FreeBSD (was Re: O_XATTR support in FreeBSD?)
Cc: Freebsd hackers list, Richard Yao, Jordan Hubbard, Pedro Giffuni, Lionel Cons
List-Id: Technical Discussions relating to FreeBSD

Cedric Blancher wrote:
> On 26 November 2013 13:27, Lionel Cons wrote:
> > On 26 November 2013 11:19, Jordan Hubbard wrote:
> >>
> >> On Nov 26, 2013, at 1:51 AM, Cedric Blancher wrote:
> >>
> >>> 1. You do not need more syscalls. Solaris uses the plain openat()
> >>> syscall for this, with the O_XATTR flag passed to the normal
> >>> open()/openat() flags to open a named attribute. Likewise read(),
> >>> write(), mmap() etc work, too.
> >>
> >> I don’t know if I’d go so far as to say “you do not need more
> >> syscalls”; there are additional functions for manipulating EAs that
> >> go well beyond the Solaris extensions to the directory and file I/O
> >> functions. Assuming you want to be able to get/set as well as
> >> enumerate or remove EAs, then you might just as well add getxattr(2),
> >> listxattr(2), removexattr(2), setxattr(2) too and follow the herd
> >> (Linux and OS X, so far).
> >
> > You mean 'follow the lemmings down into the abyss'? :)
> >
> >> We’re also glossing over ACLs and where they get to live. I don’t
> >> know if Robert and friends have stuck them in a separate namespace on
> >> FreeBSD or if they’re in system-protected EAs, as they are in OS X,
> >> but ACL preservation across serialization / deserialization is just
> >> as important as it is for EAs.
> >
> > Could we first agree what we are talking about, please? I'm a bit new
> > to this thread, but AFAIK we are talking about the Windows Alternate
> > Data Streams as they appear in networked filesystems like NFSv4 and
> > CIFS and physical filesystems like NTFS, ZFS and Solaris UFS, right?
> > ACLs have no direct relation to those streams.
> >
> > The attributes support from Linux has proven (at least from CERN's
> > viewpoint) pretty useless because of its size constraints and
> > crappy API (i.e. no mmap(), no sparse support, no normal tools can
> > access them, ...) so IMHO the herd to follow is the herd which
> > implements at least the following requirements:
> > 1) A proper implementation, which includes access using the normal
> > system utilities (in Solaris there is the runat(1) utility to access
> > the hidden directory containing the attribute files, and bash4.3 and
> > ksh have cd -@ to change into the hidden directories containing the
> > attribute files. From that point on (inside the hidden directory)
> > ls(1) and even chown(1) and chmod(1) work as usual. You can even
> > stick ZFS and NFSv4 ACLs on the files in the hidden directory
> > containing the attribute files)
> > 2) read(), write() and mmap() access, i.e. the normal POSIX API (of
> > course with the minor extension to flag an access to an alternate
> > data stream or the directory containing the alternate data streams)
> > 3) Support in networked filesystems (i.e. NFSv4, CIFS)
> > 4) No size restrictions (just to explain, at CERN the alternate data
> > streams are often precompiled caches or index files of the main
> > file's contents, and can easily be in the TB range)
> > 5) Support for sparse data (i.e. SEEK_HOLE and SEEK_DATA)
> > 6) More than one implementation available
> >
> > AFAIK Solaris, Nexenta, Illumos (NFSv4, ZFS, UFS) and Windows
> > Alternate Data Streams (CIFS, NTFS) fit these requirements.
>
> +1
>
> Other argument pro-Alternate Data Streams: Alternate Data Streams are
> a superset of the Linux extended attributes (and can thus be used to
> emulate them in libc), have all their strengths but none of their
> weaknesses (like the hideously duplicated vfs apis and the lack of
> support in POSIX utilities).
>
Not exactly, as I understand it. Linux (and FreeBSD) extended attributes
support atomic replacement of the entire attribute value. At least for
NFSv4 named attributes, this cannot be emulated, since the named attribute
is replaced by a "Setattr size=0, write offset=0" and there is no way to
do those 2 operations as one atomic operation.

I suppose a syscall could be implemented as an atomic update (in FreeBSD
an exclusive vnode lock could be used) for local file systems, but since
the VOPs for the atomic, size-limited extended attributes already exist,
I don't see emulation as useful.

The NFSv4 working group seems to have decided to add support for
atomically set extended attributes (Linux, FreeBSD style) to a future
minor revision, because they cannot be accurately emulated with named
attributes.

rick

> IMHO the Solaris/Illumos/Nexenta solution of O_XATTR provides a better
> integration into the Unix filesystem philosophy (everything is a file)
> and has already reached such market penetration that common
> off-the-shelf shells like bash and ksh have integrated support for it.
>
> Ced
> --
> Cedric Blancher
> Institute Pasteur
>
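
To illustrate the atomicity point in the reply above, here is a minimal sketch using an ordinary file as a stand-in for a named attribute. It is an analogy only and does not use any NFSv4, FreeBSD extattr or Solaris O_XATTR interface. The helper names (replace_two_step, replace_atomic, read_value, demo) are hypothetical, and the write-to-temp-then-rename step is just one way to get whole-value replacement on a local POSIX file system. It is not what the extended attribute VOPs do internally.

```python
import os
import tempfile


def read_value(path):
    """Return the current contents of the stand-in 'attribute' file."""
    with open(path, "rb") as f:
        return f.read()


def replace_two_step(path, value, observe=None):
    """Replace the value the way the reply describes for named attributes:
    first set size=0, then write at offset 0, as two separate operations."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.ftruncate(fd, 0)        # step 1: "Setattr size=0"
        if observe is not None:
            observe()              # a concurrent reader could run right here
        os.pwrite(fd, value, 0)    # step 2: "write offset=0"
    finally:
        os.close(fd)


def replace_atomic(path, value):
    """Stand-in for an atomic whole-value set: write a temporary file in the
    same directory, then rename it over the target in one step."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, value)
    finally:
        os.close(fd)
    os.replace(tmp, path)          # readers see the old or the new value


def demo():
    """Run both replacement styles and record what a reader could observe."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "attr")
        with open(path, "wb") as f:
            f.write(b"old-value")

        seen = []
        replace_two_step(path, b"new-value",
                         observe=lambda: seen.append(read_value(path)))
        after_two_step = read_value(path)

        replace_atomic(path, b"newer")
        return seen, after_two_step, read_value(path)


if __name__ == "__main__":
    print(demo())
```

In the two-step variant the reader that runs between the truncate and the write sees an empty value, which is neither the old nor the new contents. That window is what makes accurate emulation of atomically replaced extended attributes impossible when the two operations cannot be combined.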