Date: Mon, 23 May 2016 17:31:19 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: freebsd-hackers@freebsd.org
Subject: Re: read(2) and thus bsdiff is limited to 2^31 bytes
Message-ID: <20160523143119.GV89104@kib.kiev.ua>
In-Reply-To: <20160523133842.GA17056@britannica.bec.de>
References: <b2515cae-b75d-66e9-4207-3cf100ab3ab0@erdgeist.org>
 <20160522225414.GB24398@britannica.bec.de>
 <154dab43060.11208cdfd132112.2616144627831899155@nextbsd.org>
 <20160522231203.GB25503@britannica.bec.de>
 <154db353935.dd5e87c1133922.4370692881788049491@nextbsd.org>
 <20160523122131.GC8747@britannica.bec.de>
 <5a607409-1b98-8944-b1f2-4422b1d28248@erdgeist.org>
 <20160523133842.GA17056@britannica.bec.de>
On Mon, May 23, 2016 at 03:38:42PM +0200, Joerg Sonnenberger wrote:
> On Mon, May 23, 2016 at 02:36:58PM +0200, Dirk Engling wrote:
> > On 23.05.16 14:21, Joerg Sonnenberger wrote:
> >
> > > Atomic meaning in this context that the read can be observed either
> > > completely or not at all. This still doesn't mean that read must
> > > execute the full size. Other cases for short read/writes are socket,
> > > pipes etc.
> >
> > On linux I found read() returning a short read, however I wonder if any
> > user land application developer ever expects a read from local file to
> > yield a short read and continue reading. Maybe I should scan base system
> > sources for all occurrences of read.
>
> They have to. Consider a signal interrupting the read.

FreeBSD ensures, at least for some filesystems, that reads are atomic
WRT writes, by your definition of atomic. Previously, this was (mostly)
ensured by holding an exclusive vnode lock around VOP_WRITE and a shared
vnode lock around VOP_READ.

Then ZFS was changed to hold only a shared lock on write, but supposedly
it had internal range locking that prevented a read from starting while
a write to an intersecting range was in progress.

Then UFS was modified to sometimes split read/write requests into
smaller VOP calls and to drop the vnode lock between them. This was done
to prevent recursing into VM/VFS on page faults during uiomove(9) from
the VOPs. As compensation, VFS-level rangelocks were introduced, for UFS
only.

And then, quite recently, ZFS was changed to operate in the same chunked
mode as UFS and, implicitly, the same VFS rangelocks are now applied to
every read and write request on both UFS and ZFS.

But none of the local filesystems allow signals to interrupt the
operations. A pending signal never results in a short read or write on
UFS or ZFS (or on msdosfs). It might be allowed for NFS by a mount
option.
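
For a portable program that cannot rely on the local-filesystem behaviour
described above, the usual defensive pattern is to loop until the
requested amount has arrived. The following is only a minimal sketch:
read_full() and READ_CHUNK_MAX are hypothetical names, not part of libc,
and the 1 GiB per-call cap is an assumed value chosen to stay below the
2^31-byte limit from the subject line.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed per-call cap (1 GiB), kept below the 2^31-byte read(2) limit. */
#define READ_CHUNK_MAX ((size_t)1 << 30)

/*
 * Hypothetical helper: keep calling read(2) until `len` bytes have been
 * read, end of file is reached, or an error other than EINTR occurs.
 * Returns the number of bytes read (less than `len` only at end of
 * file), or -1 with errno set by read(2).
 */
static ssize_t
read_full(int fd, void *buf, size_t len)
{
	char *p = buf;
	size_t done = 0;

	while (done < len) {
		size_t want = len - done;
		ssize_t n;

		if (want > READ_CHUNK_MAX)
			want = READ_CHUNK_MAX;

		n = read(fd, p + done, want);
		if (n > 0)
			done += (size_t)n;	/* short read: keep going */
		else if (n == 0)
			break;			/* end of file */
		else if (errno != EINTR)
			return (-1);		/* real error */
		/* EINTR: a signal interrupted the call, so retry. */
	}
	return ((ssize_t)done);
}
```

A caller would treat a return value smaller than `len` as end of file
and -1 as an error. The same loop covers sockets and pipes, where short
reads are routine.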