From: John Baldwin
To: freebsd-current@freebsd.org
Cc: Tijl Coosemans
Date: Wed, 25 Jan 2012 11:29:22 -0500
Subject: Re: posix_fadvise noreuse disables file caching
Message-Id: <201201251129.22368.jhb@freebsd.org>
In-Reply-To: <201201201412.13269.jhb@freebsd.org>
References: <201201191739.48327.tijl@coosemans.org> <201201201412.13269.jhb@freebsd.org>

On Friday, January 20, 2012 2:12:13 pm John Baldwin wrote:
> On Thursday, January 19, 2012 11:39:42 am Tijl Coosemans wrote:
> > Hi,
> >
> > I recently noticed that multimedia/vlc generates a lot of disk IO when
> > playing media files.
> > For instance, when playing a 320kbps mp3 gstat
> > reports about 1250kBps (=10000kbps). That's quite a lot of overhead.
> >
> > It turns out that vlc sets POSIX_FADV_NOREUSE on the entire file and
> > reads in chunks of 1028 bytes. FreeBSD implements NOREUSE as if
> > O_DIRECT was specified during open(2), i.e. it disables all caching.
> > That means every 1028 byte read turns into a 32KiB read (new default
> > block size in 9.0) which explains the above numbers.
> >
> > I've copied the relevant vlc code below (modules/access/file.c:Open()).
> > It's interesting to see that on OSX it sets F_NOCACHE which disables
> > caching too, but combined with F_RDAHEAD there's still read-ahead
> > caching.
> >
> > I don't think POSIX intended for NOREUSE to mean O_DIRECT. It should
> > still cache data (and even do read-ahead if F_RDAHEAD is specified),
> > and once data is fetched from the cache, it can be marked WONTNEED.
> POSIX doesn't specify O_DIRECT, so it's not clear what it asks for.
>
> > Is it possible to implement it this way, or if not to just ignore
> > the NOREUSE hint for now?
>
> I think it would be good to improve NOREUSE, though I had sort of
> assumed that applications using NOREUSE would do their own buffering
> and read full blocks. We could perhaps reimplement NOREUSE by doing
> the equivalent of POSIX_FADV_DONTNEED after each read to free buffers
> and pages after the data is copied out to userland. I also have an
> XXX about whether or not NOREUSE should still allow read-ahead as it
> isn't very clear what the right thing to do there is. HP-UX (IIRC)
> has an fadvise() that lets you specify multiple policies, so you
> could specify both NOREUSE and SEQUENTIAL for a single region to
> get read-ahead but still release memory once the data is read once.

So I've come up with this untested patch. It uses
VOP_ADVISE(FADV_DONTNEED) after read(2) calls to a NOREUSE region, and
leaves read-ahead caching enabled for NOREUSE.
FADV_DONTNEED doesn't do any good really for writes (it only flushes
clean buffers), so I've left write(2) operations as using IO_DIRECT
still. Does this sound reasonable? I've not yet tested this at all:

Index: vfs_vnops.c
===================================================================
--- vfs_vnops.c	(revision 230331)
+++ vfs_vnops.c	(working copy)
@@ -519,6 +519,7 @@ vn_read(fp, uio, active_cred, flags, td)
 	int error, ioflag;
 	struct mtx *mtxp;
 	int advice, vfslocked;
+	off_t offset;
 
 	KASSERT(uio->uio_td == td, ("uio_td %p is not td %p",
 	    uio->uio_td, td));
@@ -558,19 +559,14 @@ vn_read(fp, uio, active_cred, flags, td)
 	switch (advice) {
 	case POSIX_FADV_NORMAL:
 	case POSIX_FADV_SEQUENTIAL:
+	case POSIX_FADV_NOREUSE:
 		ioflag |= sequential_heuristic(uio, fp);
 		break;
 	case POSIX_FADV_RANDOM:
 		/* Disable read-ahead for random I/O. */
 		break;
-	case POSIX_FADV_NOREUSE:
-		/*
-		 * Request the underlying FS to discard the buffers
-		 * and pages after the I/O is complete.
-		 */
-		ioflag |= IO_DIRECT;
-		break;
 	}
+	offset = uio->uio_offset;
 
 #ifdef MAC
 	error = mac_vnode_check_read(active_cred, fp->f_cred, vp);
@@ -587,6 +583,10 @@ vn_read(fp, uio, active_cred, flags, td)
 	}
 	fp->f_nextoff = uio->uio_offset;
 	VOP_UNLOCK(vp, 0);
+	if (error == 0 && advice == POSIX_FADV_NOREUSE &&
+	    offset != uio->uio_offset)
+		error = VOP_ADVISE(vp, offset, uio->uio_offset - 1,
+		    POSIX_FADV_DONTNEED);
 	VFS_UNLOCK_GIANT(vfslocked);
 	return (error);
 }

-- 
John Baldwin
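
For anyone who wants to try the same idea from userland, here is a
minimal sketch of the approach the patch takes inside vn_read(): do an
ordinary read (so caching and read-ahead still apply), then issue
POSIX_FADV_DONTNEED on the range that was just read. The helper names
read_noreuse() and consume_file() are hypothetical and are not taken
from vlc or the FreeBSD tree; the 1028-byte buffer only mirrors the
chunk size reported for vlc earlier in this thread. posix_fadvise(2)
takes an offset and a length, whereas the patch passes
uio->uio_offset - 1 as the end of the range to VOP_ADVISE().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes at the current file offset,
 * then hint that the bytes just read will not be needed again.
 * Returns whatever read(2) returned.
 */
static ssize_t
read_noreuse(int fd, void *buf, size_t len)
{
	off_t start;
	ssize_t n;

	/* Remember where this read begins, like "offset" in the patch. */
	start = lseek(fd, 0, SEEK_CUR);
	if (start == (off_t)-1)
		return (-1);

	do {
		n = read(fd, buf, len);
	} while (n == -1 && errno == EINTR);

	/* Only advise when the read succeeded and moved the offset. */
	if (n > 0) {
		/* Advice only: the returned error number is ignored. */
		(void)posix_fadvise(fd, start, (off_t)n, POSIX_FADV_DONTNEED);
	}
	return (n);
}

/*
 * Hypothetical helper: read a whole file in 1028-byte chunks.
 * Returns the number of bytes read, or -1 on error.
 */
static off_t
consume_file(int fd)
{
	char buf[1028];
	off_t total = 0;
	ssize_t n;

	while ((n = read_noreuse(fd, buf, sizeof(buf))) > 0)
		total += n;
	return (n < 0 ? (off_t)-1 : total);
}
```

To use it, open(2) a file read-only and call consume_file(fd). Whether
the hint changes what gstat shows depends on the kernel and the
filesystem, so treat this as an experiment rather than a fix.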