Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 29 Jan 2012 22:31:51 +1100 (EST)
From:      Bruce Evans <brde@optusnet.com.au>
To:        Jilles Tjoelker <jilles@stack.nl>
Cc:        flo@FreeBSD.org, Giovanni Trematerra <gianni@FreeBSD.org>, Attilio Rao <attilio@FreeBSD.org>, Konstantin Belousov <kib@FreeBSD.org>, freebsd-arch@FreeBSD.org
Subject:   Re: pipe/fifo code merged.
Message-ID:  <20120129215808.V844@besplex.bde.org>
In-Reply-To: <20120128233712.GA92694@stack.nl>
References:  <CACfq093o9iVZKxCj58OR2hpCLDYTUTdxg_re_bEMYn2SrNrLCQ@mail.gmail.com> <20120110005155.S2378@besplex.bde.org> <CACfq09225iMYLe6p8jSiVhsDw_rqTyEHsvPdtZXLrQYT0-skzg@mail.gmail.com> <20120110153807.H943@besplex.bde.org> <CACfq092F%2BjeWT5SiQe-vy-oeFRYMtiWLergdBVMasVrpFU2Vpw@mail.gmail.com> <20120128233712.GA92694@stack.nl>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 29 Jan 2012, Jilles Tjoelker wrote:

> On Tue, Jan 17, 2012 at 12:43:19PM +0100, Giovanni Trematerra wrote:
>> Did you agree that this patch
>> http://www.trematerra.net/patches/pipefifo_merge2.4.diff

Sorry I haven;t got around to reviewing this.

>> doesn't introduce any further regressions while it fixes
>
>> - style bugs you pointed out.
>> - pipe_stat now use underlying filesystem information for pipes.

The last still can't work.  It is necessary to use the underlying
file system for all read and write metadata operations.  It is too
hard to emulate what the underlying file system would do and write
to it when the pipe state is released.

>> - comment about pipeinfo was intended for.
>> - race into pipe_poll (look at fifo_iseof line).
>
> I tested this version of the patch and found that it breaks opening a
> fifo with O_TRUNC: it fails with [EINVAL]. This appears to be
> pipe_truncate()'s doing. Previously, truncate requests went to the
> vnode.

This probably affects truncate() and ftruncate() too.  Most file systems
accept bogus truncation requests and do nothing, but pipe_truncate()
always returns EINVAL.

POSIX requires that O_TRUNC have no effect for fifos and terminals, and
of course to work if the file is regular.  Its effect for other file types
is implementation-defined.  The FreeBSD implementation doesn't actually
define (document) the behaviour anywhere AFAIK.  Its documentation in
open(2) is just broken, since says that O_TRUNC causes truncation to size
0 without mentioning any restrictions on the file type.

POSIX requires ftruncate() to fail if the fd refers to a directory, and
of course to work if the fd refers to a regular file, and to have certain
behaviour if the fd refers to a shared memory object.  Its effect for other
file types is unspecifed.

ftruncate is normally implemented in the file system's VOP_SETATTR()
(fo_truncate for vnodes is vn_truncate() which calls VOP_SETATTR()).
If you fix the attributes, then O_TRUNC will be fixed automatically.

devfs implements fo_truncate bogusly, using a lot of code to do almost
nothing.  It starts with a perfectly suitable fileop for this, namely
devfs_ops_f = devfs_truncate_f().  This just calls vop.fo_truncate =
vn_truncate(), which executes a lot of code that reduces to calling
VOP_SETATTR() = devfs_vnodeops.vop_setattr = devfs_settatr().
devfs_setattr() then silenly ignores the request to set the size.  But
this is wrong for directories.  However, an error is returned for
directories because vn_truncate() doesn't quite reduce to VOP_SETATTR().
It also checks for directories and returns EISDIR for them.  Perhaps
always going through the vn layer is a good idea after all.

ffs has the following complications in ufs_setattr() for truncation:
- truncation works on symlinks too.  This seems to be undocumented.
- but a read-only mount gives error EROFS
- a snapshot for the inode gives EPERM.  This seems to be undocumented.
- immutability or append-only for the inode gives EPERM.

BTW, most or all setattr() vops have a broken test for va_bytes ==
VNOVAL.  They break va_bytes in the test by casting it to int.  Thus
2**32-1 values for it with the low 32 bits set are misclassified as
VNOVAL (1 of such 2**32 values is correctly classified).  But this has
no effect, since va_bytes is always VNOVAL.  I think most or all the
other unsettable attributes that are laboriously checked for by setattr
vnops are also impossible.  It would take a kernel bug for a generic
unsettable attributes to be requested to be set.  A really large bug,
like requesting to change the file flags instead of the file mode.
Checking for the generic unsettable attributes just obscures the checks
for unsupported attributes that some file systems should be doing.

Bruce



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20120129215808.V844>