Date: Mon, 30 Jan 2012 15:12:30 -0500
From: John Baldwin <jhb@freebsd.org>
To: src-committers@freebsd.org
Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org
Subject: Re: svn commit: r230782 - head/sys/kern
Message-ID: <201201301512.30116.jhb@freebsd.org>
In-Reply-To: <201201301935.q0UJZGW7099426@svn.freebsd.org>
References: <201201301935.q0UJZGW7099426@svn.freebsd.org>

On Monday, January 30, 2012 2:35:16 pm John Baldwin wrote:
> Author: jhb
> Date: Mon Jan 30 19:35:15 2012
> New Revision: 230782
> URL: http://svn.freebsd.org/changeset/base/230782
>
> Log:
>   Refine the implementation of POSIX_FADV_NOREUSE for the read(2) case such
>   that instead of using direct I/O it allows read-ahead similar to
>   POSIX_FADV_NORMAL, but invokes VOP_ADVISE(POSIX_FADV_DONTNEED) after the
>   read(2) has completed to purge just-read data.  The write(2) path continues
>   to use direct I/O for POSIX_FADV_NOREUSE for now.  Note that NOREUSE works
>   optimally if an application reads and writes full fs blocks.

Oops, forgot:

Tested by:	jilles

The NOREUSE bits may still need further refinement.  For example, if we
allow something along the lines of
'POSIX_FADV_NOREUSE | POSIX_FADV_SEQUENTIAL', then we could change the
VOP_ADVISE() here to use 0 as the starting offset, which should do a
better job of not leaving data in RAM due to reading partial blocks.

Also, sequentially reading a file on unaligned block offsets with NOREUSE
can currently result in extraneous reads.  We could possibly alleviate
those by changing DONTNEED to flush only wholly-contained blocks rather
than wholly-contained pages from the backing VM object.  However, without
the previous change I suggested, that would exacerbate the problem of
NOREUSE not actually purging any data from RAM.

The problem with the | approach, though, is that it is not portable, so
portable programs like vlc are unlikely to use it.  HP-UX had an extended
variant of fadvise() that allowed multiple policies to be set on a range,
apparently to handle exactly this case (sequential and noreuse).  The
underlying problem seems to be that noreuse is really orthogonal to the
other access-pattern hints (normal vs random vs sequential).

Finally, I've wondered if POSIX_FADV_SEQUENTIAL shouldn't just mandate the
maximum read-ahead and write-clustering rather than using the heuristics.
If we did that, it's not completely clear what the "right" thing to do
would be if an application calls posix_fadvise(POSIX_FADV_SEQUENTIAL)
followed by fcntl(F_READAHEAD) with a different size, especially given
that posix_fadvise() can in theory apply to just a range of the file,
whereas F_READAHEAD applies globally to the file descriptor.

-- 
John Baldwin
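
For illustration only (this is not code from r230782): a minimal userland
sketch of the usage pattern the commit log calls optimal, namely setting
POSIX_FADV_NOREUSE and then reading in full file system blocks. The helper
name read_noreuse() is hypothetical, and using st_blksize as a stand-in for
the fs block size is an assumption; the 4096-byte fallback is arbitrary.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a file sequentially from offset 0 with
 * POSIX_FADV_NOREUSE set on the whole file, issuing reads sized to
 * st_blksize (assumed here to approximate the fs block size).
 * Returns the total number of bytes read, or -1 on error.
 */
long long
read_noreuse(const char *path)
{
	struct stat sb;
	long long total = 0;
	size_t blksize;
	ssize_t n;
	char *buf;
	int fd;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	if (fstat(fd, &sb) != 0) {
		close(fd);
		return (-1);
	}

	/*
	 * The hint is advisory.  posix_fadvise() returns an error number
	 * rather than setting errno; a failure is ignored here because
	 * the reads below are still correct without the hint.
	 * An offset and length of 0 cover the entire file.
	 */
	(void)posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);

	blksize = sb.st_blksize > 0 ? (size_t)sb.st_blksize : 4096;
	buf = malloc(blksize);
	if (buf == NULL) {
		close(fd);
		return (-1);
	}

	/* Block-sized reads starting at 0 keep every read block-aligned. */
	while ((n = read(fd, buf, blksize)) > 0)
		total += n;

	free(buf);
	close(fd);
	return (n < 0 ? -1 : total);
}
```

A caller would simply invoke read_noreuse("/path/to/file") and check for
-1. Whether the just-read data is actually purged depends on the kernel;
the sketch only shows the application side of the hint.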