From: Gleb Smirnoff
Date: Wed, 16 Dec 2015 21:30:45 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Message-Id: <201512162130.tBGLUjPj083575@repo.freebsd.org>
Subject: svn commit: r292373 - in head: share/man/man9 sys/cddl/contrib/opensolaris/uts/common/fs/zfs sys/dev/drm2/i915 sys/dev/drm2/ttm sys/dev/md sys/fs/fuse sys/fs/nfsclient sys/fs/smbfs sys/fs/tmpfs sys...
X-SVN-Group: head

Author: glebius
Date: Wed Dec 16 21:30:45 2015
New Revision: 292373
URL: https://svnweb.freebsd.org/changeset/base/292373

Log:
  A change to the KPI of vm_pager_get_pages() and the underlying
  VOP_GETPAGES().

  o With the new KPI, consumers can request contiguous ranges of pages,
    and unlike before, all pages will be kept busied on return, as was
    previously done for the 'reqpage' only. Now the reqpage goes away.
    With the new interface it is easier to implement code protected from
    race conditions.

    Such arrayed requests should, for now, be preceded by a call to
    vm_pager_has_page() to make sure that the request is possible. This
    could be improved later, making vm_pager_has_page() obsolete.

    Strengthening the promises on the busy state of the array of pages
    allows us to remove such hacks as swp_pager_free_nrpage() and
    vm_pager_free_nonreq().

  o The new KPI accepts two integer pointers that may optionally point
    at values for read ahead and read behind, which a pager may do if it
    can. These pages are completely owned by the pager, and not
    controlled by the caller.

    This shifts the UFS-specific readahead logic from vm_fault.c, which
    should be file system agnostic, into vnode_pager.c. It also removes
    one VOP_BMAP() request per hard fault.

  Discussed with: kib, alc, jeff, scottl
  Sponsored by: Nginx, Inc.
Sponsored by: Netflix Modified: head/share/man/man9/VOP_GETPAGES.9 head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c head/sys/dev/drm2/i915/i915_gem.c head/sys/dev/drm2/ttm/ttm_tt.c head/sys/dev/md/md.c head/sys/fs/fuse/fuse_vnops.c head/sys/fs/nfsclient/nfs_clbio.c head/sys/fs/smbfs/smbfs_io.c head/sys/fs/tmpfs/tmpfs_subr.c head/sys/kern/kern_exec.c head/sys/kern/uipc_shm.c head/sys/kern/uipc_syscalls.c head/sys/kern/vfs_default.c head/sys/kern/vnode_if.src head/sys/sys/buf.h head/sys/vm/default_pager.c head/sys/vm/device_pager.c head/sys/vm/phys_pager.c head/sys/vm/sg_pager.c head/sys/vm/swap_pager.c head/sys/vm/vm_fault.c head/sys/vm/vm_glue.c head/sys/vm/vm_object.c head/sys/vm/vm_object.h head/sys/vm/vm_page.c head/sys/vm/vm_pager.c head/sys/vm/vm_pager.h head/sys/vm/vnode_pager.c head/sys/vm/vnode_pager.h Modified: head/share/man/man9/VOP_GETPAGES.9 ============================================================================== --- head/share/man/man9/VOP_GETPAGES.9 Wed Dec 16 21:15:12 2015 (r292372) +++ head/share/man/man9/VOP_GETPAGES.9 Wed Dec 16 21:30:45 2015 (r292373) @@ -29,7 +29,7 @@ .\" .\" $FreeBSD$ .\" -.Dd September 12, 2014 +.Dd December 16, 2015 .Dt VOP_GETPAGES 9 .Os .Sh NAME @@ -41,7 +41,7 @@ .In sys/vnode.h .In vm/vm.h .Ft int -.Fn VOP_GETPAGES "struct vnode *vp" "vm_page_t *ma" "int count" "int reqpage" +.Fn VOP_GETPAGES "struct vnode *vp" "vm_page_t *ma" "int count" "int *rbehind" "int *rahead" .Ft int .Fn VOP_PUTPAGES "struct vnode *vp" "vm_page_t *ma" "int count" "int sync" "int *rtvals" .Sh DESCRIPTION @@ -63,7 +63,7 @@ locks are held. Both methods return in the same state on both success and error returns. .Pp The arguments are: -.Bl -tag -width reqpage +.Bl -tag -width rbehind .It Fa vp The file to access. .It Fa ma @@ -78,9 +78,16 @@ if the write should be synchronous. An array of VM system result codes indicating the status of each page written by .Fn VOP_PUTPAGES . 
-.It Fa reqpage -The index in the page array of the requested page; i.e., the one page which -the implementation of this method must handle. +.It Fa rbehind +Optional pointer to integer specifying number of pages to be read behind, if +possible. +If the filesystem supports that feature, number of actually read pages is +reported back, otherwise zero is returned. +.It Fa rahead +Optional pointer to integer specifying number of pages to be read ahead, if +possible. +If the filesystem supports that feature, number of actually read pages is +reported back, otherwise zero is returned. .El .Pp The status of the Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c ============================================================================== --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c Wed Dec 16 21:30:45 2015 (r292373) @@ -5762,12 +5762,13 @@ ioflags(int ioflags) } static int -zfs_getpages(struct vnode *vp, vm_page_t *m, int count, int reqpage) +zfs_getpages(struct vnode *vp, vm_page_t *m, int count, int *rbehind, + int *rahead) { znode_t *zp = VTOZ(vp); zfsvfs_t *zfsvfs = zp->z_zfsvfs; objset_t *os = zp->z_zfsvfs->z_os; - vm_page_t mfirst, mlast, mreq; + vm_page_t mlast; vm_object_t object; caddr_t va; struct sf_buf *sf; @@ -5776,82 +5777,46 @@ zfs_getpages(struct vnode *vp, vm_page_t vm_pindex_t reqstart, reqend; int pcount, lsize, reqsize, size; + if (rbehind) + *rbehind = 0; + if (rahead) + *rahead = 0; + ZFS_ENTER(zfsvfs); ZFS_VERIFY_ZP(zp); pcount = OFF_TO_IDX(round_page(count)); - mreq = m[reqpage]; - object = mreq->object; - error = 0; - - if (pcount > 1 && zp->z_blksz > PAGESIZE) { - startoff = rounddown(IDX_TO_OFF(mreq->pindex), zp->z_blksz); - reqstart = OFF_TO_IDX(round_page(startoff)); - if (reqstart < m[0]->pindex) - reqstart = 0; - else - reqstart = reqstart - m[0]->pindex; - endoff = 
roundup(IDX_TO_OFF(mreq->pindex) + PAGE_SIZE, - zp->z_blksz); - reqend = OFF_TO_IDX(trunc_page(endoff)) - 1; - if (reqend > m[pcount - 1]->pindex) - reqend = m[pcount - 1]->pindex; - reqsize = reqend - m[reqstart]->pindex + 1; - KASSERT(reqstart <= reqpage && reqpage < reqstart + reqsize, - ("reqpage beyond [reqstart, reqstart + reqsize[ bounds")); - } else { - reqstart = reqpage; - reqsize = 1; - } - mfirst = m[reqstart]; - mlast = m[reqstart + reqsize - 1]; zfs_vmobject_wlock(object); - - for (i = 0; i < reqstart; i++) { - vm_page_lock(m[i]); - vm_page_free(m[i]); - vm_page_unlock(m[i]); - } - for (i = reqstart + reqsize; i < pcount; i++) { - vm_page_lock(m[i]); - vm_page_free(m[i]); - vm_page_unlock(m[i]); - } - - if (mreq->valid && reqsize == 1) { - if (mreq->valid != VM_PAGE_BITS_ALL) - vm_page_zero_invalid(mreq, TRUE); + if (m[pcount - 1]->valid != 0 && --pcount == 0) { zfs_vmobject_wunlock(object); ZFS_EXIT(zfsvfs); return (zfs_vm_pagerret_ok); } - PCPU_INC(cnt.v_vnodein); - PCPU_ADD(cnt.v_vnodepgsin, reqsize); + object = m[0]->object; + mlast = m[pcount - 1]; - if (IDX_TO_OFF(mreq->pindex) >= object->un_pager.vnp.vnp_size) { - for (i = reqstart; i < reqstart + reqsize; i++) { - if (i != reqpage) { - vm_page_lock(m[i]); - vm_page_free(m[i]); - vm_page_unlock(m[i]); - } - } + if (IDX_TO_OFF(mlast->pindex) >= + object->un_pager.vnp.vnp_size) { zfs_vmobject_wunlock(object); ZFS_EXIT(zfsvfs); return (zfs_vm_pagerret_bad); } + PCPU_INC(cnt.v_vnodein); + PCPU_ADD(cnt.v_vnodepgsin, reqsize); + lsize = PAGE_SIZE; if (IDX_TO_OFF(mlast->pindex) + lsize > object->un_pager.vnp.vnp_size) - lsize = object->un_pager.vnp.vnp_size - IDX_TO_OFF(mlast->pindex); - + lsize = object->un_pager.vnp.vnp_size - + IDX_TO_OFF(mlast->pindex); zfs_vmobject_wunlock(object); - for (i = reqstart; i < reqstart + reqsize; i++) { + error = 0; + for (i = 0; i < pcount; i++) { size = PAGE_SIZE; - if (i == (reqstart + reqsize - 1)) + if (i == pcount - 1) size = lsize; va = zfs_map_page(m[i], 
&sf); error = dmu_read(os, zp->z_id, IDX_TO_OFF(m[i]->pindex), @@ -5860,21 +5825,15 @@ zfs_getpages(struct vnode *vp, vm_page_t bzero(va + size, PAGE_SIZE - size); zfs_unmap_page(sf); if (error != 0) - break; + goto out; } zfs_vmobject_wlock(object); - - for (i = reqstart; i < reqstart + reqsize; i++) { - if (!error) - m[i]->valid = VM_PAGE_BITS_ALL; - KASSERT(m[i]->dirty == 0, ("zfs_getpages: page %p is dirty", m[i])); - if (i != reqpage) - vm_page_readahead_finish(m[i]); - } - + for (i = 0; i < pcount; i++) + m[i]->valid = VM_PAGE_BITS_ALL; zfs_vmobject_wunlock(object); +out: ZFS_ACCESSTIME_STAMP(zfsvfs, zp); ZFS_EXIT(zfsvfs); return (error ? zfs_vm_pagerret_error : zfs_vm_pagerret_ok); @@ -5886,11 +5845,13 @@ zfs_freebsd_getpages(ap) struct vnode *a_vp; vm_page_t *a_m; int a_count; - int a_reqpage; + int *a_rbehind; + int *a_rahead; } */ *ap; { - return (zfs_getpages(ap->a_vp, ap->a_m, ap->a_count, ap->a_reqpage)); + return (zfs_getpages(ap->a_vp, ap->a_m, ap->a_count, ap->a_rbehind, + ap->a_rahead)); } static int Modified: head/sys/dev/drm2/i915/i915_gem.c ============================================================================== --- head/sys/dev/drm2/i915/i915_gem.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/dev/drm2/i915/i915_gem.c Wed Dec 16 21:30:45 2015 (r292373) @@ -4338,7 +4338,7 @@ i915_gem_wire_page(vm_object_t object, v page = vm_page_grab(object, pindex, VM_ALLOC_NORMAL); if (page->valid != VM_PAGE_BITS_ALL) { if (vm_pager_has_page(object, pindex, NULL, NULL)) { - rv = vm_pager_get_pages(object, &page, 1, 0); + rv = vm_pager_get_pages(object, &page, 1, NULL, NULL); if (rv != VM_PAGER_OK) { vm_page_lock(page); vm_page_free(page); Modified: head/sys/dev/drm2/ttm/ttm_tt.c ============================================================================== --- head/sys/dev/drm2/ttm/ttm_tt.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/dev/drm2/ttm/ttm_tt.c Wed Dec 16 21:30:45 2015 (r292373) @@ -291,7 +291,8 @@ int ttm_tt_swapin(struct ttm_tt *ttm) 
from_page = vm_page_grab(obj, i, VM_ALLOC_NORMAL); if (from_page->valid != VM_PAGE_BITS_ALL) { if (vm_pager_has_page(obj, i, NULL, NULL)) { - rv = vm_pager_get_pages(obj, &from_page, 1, 0); + rv = vm_pager_get_pages(obj, &from_page, 1, + NULL, NULL); if (rv != VM_PAGER_OK) { vm_page_lock(from_page); vm_page_free(from_page); Modified: head/sys/dev/md/md.c ============================================================================== --- head/sys/dev/md/md.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/dev/md/md.c Wed Dec 16 21:30:45 2015 (r292373) @@ -1019,7 +1019,8 @@ mdstart_swap(struct md_s *sc, struct bio if (m->valid == VM_PAGE_BITS_ALL) rv = VM_PAGER_OK; else - rv = vm_pager_get_pages(sc->object, &m, 1, 0); + rv = vm_pager_get_pages(sc->object, &m, 1, + NULL, NULL); if (rv == VM_PAGER_ERROR) { vm_page_xunbusy(m); break; @@ -1046,7 +1047,8 @@ mdstart_swap(struct md_s *sc, struct bio } } else if (bp->bio_cmd == BIO_WRITE) { if (len != PAGE_SIZE && m->valid != VM_PAGE_BITS_ALL) - rv = vm_pager_get_pages(sc->object, &m, 1, 0); + rv = vm_pager_get_pages(sc->object, &m, 1, + NULL, NULL); else rv = VM_PAGER_OK; if (rv == VM_PAGER_ERROR) { @@ -1065,7 +1067,8 @@ mdstart_swap(struct md_s *sc, struct bio m->valid = VM_PAGE_BITS_ALL; } else if (bp->bio_cmd == BIO_DELETE) { if (len != PAGE_SIZE && m->valid != VM_PAGE_BITS_ALL) - rv = vm_pager_get_pages(sc->object, &m, 1, 0); + rv = vm_pager_get_pages(sc->object, &m, 1, + NULL, NULL); else rv = VM_PAGER_OK; if (rv == VM_PAGER_ERROR) { Modified: head/sys/fs/fuse/fuse_vnops.c ============================================================================== --- head/sys/fs/fuse/fuse_vnops.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/fs/fuse/fuse_vnops.c Wed Dec 16 21:30:45 2015 (r292373) @@ -1753,6 +1753,10 @@ fuse_vnop_getpages(struct vop_getpages_a cred = curthread->td_ucred; /* XXX */ pages = ap->a_m; count = ap->a_count; + if (ap->a_rbehind) + *ap->a_rbehind = 0; + if (ap->a_rahead) + *ap->a_rahead = 0; if 
(!fsess_opt_mmap(vnode_mount(vp))) { FS_DEBUG("called on non-cacheable vnode??\n"); @@ -1761,26 +1765,21 @@ fuse_vnop_getpages(struct vop_getpages_a npages = btoc(count); /* - * If the requested page is partially valid, just return it and - * allow the pager to zero-out the blanks. Partially valid pages - * can only occur at the file EOF. + * If the last page is partially valid, just return it and allow + * the pager to zero-out the blanks. Partially valid pages can + * only occur at the file EOF. + * + * XXXGL: is that true for FUSE, which is a local filesystem, + * but still somewhat disconnected from the kernel? */ - VM_OBJECT_WLOCK(vp->v_object); - fuse_vm_page_lock_queues(); - if (pages[ap->a_reqpage]->valid != 0) { - for (i = 0; i < npages; ++i) { - if (i != ap->a_reqpage) { - fuse_vm_page_lock(pages[i]); - vm_page_free(pages[i]); - fuse_vm_page_unlock(pages[i]); - } + if (pages[npages - 1]->valid != 0) { + if (--npages == 0) { + VM_OBJECT_WUNLOCK(vp->v_object); + return (VM_PAGER_OK); } - fuse_vm_page_unlock_queues(); - VM_OBJECT_WUNLOCK(vp->v_object); - return 0; - } - fuse_vm_page_unlock_queues(); + count = npages << PAGE_SHIFT; + } VM_OBJECT_WUNLOCK(vp->v_object); /* @@ -1811,17 +1810,6 @@ fuse_vnop_getpages(struct vop_getpages_a if (error && (uio.uio_resid == count)) { FS_DEBUG("error %d\n", error); - VM_OBJECT_WLOCK(vp->v_object); - fuse_vm_page_lock_queues(); - for (i = 0; i < npages; ++i) { - if (i != ap->a_reqpage) { - fuse_vm_page_lock(pages[i]); - vm_page_free(pages[i]); - fuse_vm_page_unlock(pages[i]); - } - } - fuse_vm_page_unlock_queues(); - VM_OBJECT_WUNLOCK(vp->v_object); return VM_PAGER_ERROR; } /* @@ -1862,8 +1850,6 @@ fuse_vnop_getpages(struct vop_getpages_a */ ; } - if (i != ap->a_reqpage) - vm_page_readahead_finish(m); } fuse_vm_page_unlock_queues(); VM_OBJECT_WUNLOCK(vp->v_object); Modified: head/sys/fs/nfsclient/nfs_clbio.c ============================================================================== --- 
head/sys/fs/nfsclient/nfs_clbio.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/fs/nfsclient/nfs_clbio.c Wed Dec 16 21:30:45 2015 (r292373) @@ -101,6 +101,10 @@ ncl_getpages(struct vop_getpages_args *a nmp = VFSTONFS(vp->v_mount); pages = ap->a_m; count = ap->a_count; + if (ap->a_rbehind) + *ap->a_rbehind = 0; + if (ap->a_rahead) + *ap->a_rahead = 0; if ((object = vp->v_object) == NULL) { ncl_printf("nfs_getpages: called with non-merged cache vnode??\n"); @@ -132,12 +136,18 @@ ncl_getpages(struct vop_getpages_args *a * If the requested page is partially valid, just return it and * allow the pager to zero-out the blanks. Partially valid pages * can only occur at the file EOF. + * + * XXXGL: is that true for NFS, where short read can occur??? */ - if (pages[ap->a_reqpage]->valid != 0) { - vm_pager_free_nonreq(object, pages, ap->a_reqpage, npages, - FALSE); - return (VM_PAGER_OK); + VM_OBJECT_WLOCK(object); + if (pages[npages - 1]->valid != 0) { + if (--npages == 0) { + VM_OBJECT_WUNLOCK(object); + return (VM_PAGER_OK); + } + count = npages << PAGE_SHIFT; } + VM_OBJECT_WUNLOCK(object); /* * We use only the kva address for the buffer, but this is extremely @@ -167,8 +177,6 @@ ncl_getpages(struct vop_getpages_args *a if (error && (uio.uio_resid == count)) { ncl_printf("nfs_getpages: error %d\n", error); - vm_pager_free_nonreq(object, pages, ap->a_reqpage, npages, - FALSE); return (VM_PAGER_ERROR); } @@ -212,8 +220,6 @@ ncl_getpages(struct vop_getpages_args *a */ ; } - if (i != ap->a_reqpage) - vm_page_readahead_finish(m); } VM_OBJECT_WUNLOCK(object); return (0); Modified: head/sys/fs/smbfs/smbfs_io.c ============================================================================== --- head/sys/fs/smbfs/smbfs_io.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/fs/smbfs/smbfs_io.c Wed Dec 16 21:30:45 2015 (r292373) @@ -424,7 +424,7 @@ smbfs_getpages(ap) #ifdef SMBFS_RWGENERIC return vop_stdgetpages(ap); #else - int i, error, nextoff, size, toff, npages, count, reqpage; + 
int i, error, nextoff, size, toff, npages, count; struct uio uio; struct iovec iov; vm_offset_t kva; @@ -436,7 +436,7 @@ smbfs_getpages(ap) struct smbnode *np; struct smb_cred *scred; vm_object_t object; - vm_page_t *pages, m; + vm_page_t *pages; vp = ap->a_vp; if ((object = vp->v_object) == NULL) { @@ -451,26 +451,25 @@ smbfs_getpages(ap) pages = ap->a_m; count = ap->a_count; npages = btoc(count); - reqpage = ap->a_reqpage; + if (ap->a_rbehind) + *ap->a_rbehind = 0; + if (ap->a_rahead) + *ap->a_rahead = 0; /* * If the requested page is partially valid, just return it and * allow the pager to zero-out the blanks. Partially valid pages * can only occur at the file EOF. + * + * XXXGL: is that true for SMB filesystem? */ - m = pages[reqpage]; - VM_OBJECT_WLOCK(object); - if (m->valid != 0) { - for (i = 0; i < npages; ++i) { - if (i != reqpage) { - vm_page_lock(pages[i]); - vm_page_free(pages[i]); - vm_page_unlock(pages[i]); - } + if (pages[npages - 1]->valid != 0) { + if (--npages == 0) { + VM_OBJECT_WUNLOCK(object); + return (VM_PAGER_OK); } - VM_OBJECT_WUNLOCK(object); - return 0; + count = npages << PAGE_SHIFT; } VM_OBJECT_WUNLOCK(object); @@ -500,22 +499,14 @@ smbfs_getpages(ap) relpbuf(bp, &smbfs_pbuf_freecnt); - VM_OBJECT_WLOCK(object); if (error && (uio.uio_resid == count)) { printf("smbfs_getpages: error %d\n",error); - for (i = 0; i < npages; i++) { - if (reqpage != i) { - vm_page_lock(pages[i]); - vm_page_free(pages[i]); - vm_page_unlock(pages[i]); - } - } - VM_OBJECT_WUNLOCK(object); return VM_PAGER_ERROR; } size = count - uio.uio_resid; + VM_OBJECT_WLOCK(object); for (i = 0, toff = 0; i < npages; i++, toff = nextoff) { vm_page_t m; nextoff = toff + PAGE_SIZE; @@ -544,9 +535,6 @@ smbfs_getpages(ap) */ ; } - - if (i != reqpage) - vm_page_readahead_finish(m); } VM_OBJECT_WUNLOCK(object); return 0; Modified: head/sys/fs/tmpfs/tmpfs_subr.c ============================================================================== --- head/sys/fs/tmpfs/tmpfs_subr.c Wed Dec 
16 21:15:12 2015 (r292372) +++ head/sys/fs/tmpfs/tmpfs_subr.c Wed Dec 16 21:30:45 2015 (r292373) @@ -1370,7 +1370,8 @@ retry: VM_OBJECT_WLOCK(uobj); goto retry; } else if (m->valid != VM_PAGE_BITS_ALL) - rv = vm_pager_get_pages(uobj, &m, 1, 0); + rv = vm_pager_get_pages(uobj, &m, 1, + NULL, NULL); else /* A cached page was reactivated. */ rv = VM_PAGER_OK; Modified: head/sys/kern/kern_exec.c ============================================================================== --- head/sys/kern/kern_exec.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/kern/kern_exec.c Wed Dec 16 21:30:45 2015 (r292373) @@ -950,8 +950,7 @@ int exec_map_first_page(imgp) struct image_params *imgp; { - int rv, i; - int initial_pagein; + int rv, i, after, initial_pagein; vm_page_t ma[VM_INITIAL_PAGEIN]; vm_object_t object; @@ -967,9 +966,18 @@ exec_map_first_page(imgp) #endif ma[0] = vm_page_grab(object, 0, VM_ALLOC_NORMAL); if (ma[0]->valid != VM_PAGE_BITS_ALL) { - initial_pagein = VM_INITIAL_PAGEIN; - if (initial_pagein > object->size) - initial_pagein = object->size; + if (!vm_pager_has_page(object, 0, NULL, &after)) { + vm_page_lock(ma[0]); + vm_page_free(ma[0]); + vm_page_unlock(ma[0]); + vm_page_xunbusy(ma[0]); + VM_OBJECT_WUNLOCK(object); + return (EIO); + } + initial_pagein = min(after, VM_INITIAL_PAGEIN); + KASSERT(initial_pagein <= object->size, + ("%s: initial_pagein %d object->size %ju", + __func__, initial_pagein, (uintmax_t )object->size)); for (i = 1; i < initial_pagein; i++) { if ((ma[i] = vm_page_next(ma[i - 1])) != NULL) { if (ma[i]->valid) @@ -984,14 +992,19 @@ exec_map_first_page(imgp) } } initial_pagein = i; - rv = vm_pager_get_pages(object, ma, initial_pagein, 0); + rv = vm_pager_get_pages(object, ma, initial_pagein, NULL, NULL); if (rv != VM_PAGER_OK) { - vm_page_lock(ma[0]); - vm_page_free(ma[0]); - vm_page_unlock(ma[0]); + for (i = 0; i < initial_pagein; i++) { + vm_page_lock(ma[i]); + vm_page_free(ma[i]); + vm_page_unlock(ma[i]); + vm_page_xunbusy(ma[i]); + } 
VM_OBJECT_WUNLOCK(object); return (EIO); } + for (i = 1; i < initial_pagein; i++) + vm_page_readahead_finish(ma[i]); } vm_page_xunbusy(ma[0]); vm_page_lock(ma[0]); Modified: head/sys/kern/uipc_shm.c ============================================================================== --- head/sys/kern/uipc_shm.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/kern/uipc_shm.c Wed Dec 16 21:30:45 2015 (r292373) @@ -189,7 +189,7 @@ uiomove_object_page(vm_object_t obj, siz m = vm_page_grab(obj, idx, VM_ALLOC_NORMAL); if (m->valid != VM_PAGE_BITS_ALL) { if (vm_pager_has_page(obj, idx, NULL, NULL)) { - rv = vm_pager_get_pages(obj, &m, 1, 0); + rv = vm_pager_get_pages(obj, &m, 1, NULL, NULL); if (rv != VM_PAGER_OK) { printf( "uiomove_object: vm_obj %p idx %jd valid %x pager error %d\n", @@ -460,7 +460,7 @@ retry: goto retry; } else if (m->valid != VM_PAGE_BITS_ALL) rv = vm_pager_get_pages(object, &m, 1, - 0); + NULL, NULL); else /* A cached page was reactivated. */ rv = VM_PAGER_OK; Modified: head/sys/kern/uipc_syscalls.c ============================================================================== --- head/sys/kern/uipc_syscalls.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/kern/uipc_syscalls.c Wed Dec 16 21:30:45 2015 (r292373) @@ -2033,7 +2033,7 @@ sendfile_readpage(vm_object_t obj, struc VM_OBJECT_WLOCK(obj); } else { if (vm_pager_has_page(obj, pindex, NULL, NULL)) { - rv = vm_pager_get_pages(obj, &m, 1, 0); + rv = vm_pager_get_pages(obj, &m, 1, NULL, NULL); SFSTAT_INC(sf_iocnt); if (rv != VM_PAGER_OK) { vm_page_lock(m); Modified: head/sys/kern/vfs_default.c ============================================================================== --- head/sys/kern/vfs_default.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/kern/vfs_default.c Wed Dec 16 21:30:45 2015 (r292373) @@ -731,12 +731,13 @@ vop_stdgetpages(ap) struct vnode *a_vp; vm_page_t *a_m; int a_count; - int a_reqpage; + int *a_rbehind; + int *a_rahead; } */ *ap; { return vnode_pager_generic_getpages(ap->a_vp, 
ap->a_m, - ap->a_count, ap->a_reqpage, NULL, NULL); + ap->a_count, ap->a_rbehind, ap->a_rahead, NULL, NULL); } static int @@ -744,8 +745,9 @@ vop_stdgetpages_async(struct vop_getpage { int error; - error = VOP_GETPAGES(ap->a_vp, ap->a_m, ap->a_count, ap->a_reqpage); - ap->a_iodone(ap->a_arg, ap->a_m, ap->a_reqpage, error); + error = VOP_GETPAGES(ap->a_vp, ap->a_m, ap->a_count, ap->a_rbehind, + ap->a_rahead); + ap->a_iodone(ap->a_arg, ap->a_m, ap->a_count, error); return (error); } Modified: head/sys/kern/vnode_if.src ============================================================================== --- head/sys/kern/vnode_if.src Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/kern/vnode_if.src Wed Dec 16 21:30:45 2015 (r292373) @@ -473,7 +473,8 @@ vop_getpages { IN struct vnode *vp; IN vm_page_t *m; IN int count; - IN int reqpage; + IN int *rbehind; + IN int *rahead; }; @@ -483,7 +484,8 @@ vop_getpages_async { IN struct vnode *vp; IN vm_page_t *m; IN int count; - IN int reqpage; + IN int *rbehind; + IN int *rahead; IN vop_getpages_iodone_t *iodone; IN void *arg; }; Modified: head/sys/sys/buf.h ============================================================================== --- head/sys/sys/buf.h Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/sys/buf.h Wed Dec 16 21:30:45 2015 (r292373) @@ -122,14 +122,13 @@ struct buf { struct ucred *b_rcred; /* Read credentials reference. */ struct ucred *b_wcred; /* Write credentials reference. 
*/ union { - TAILQ_ENTRY(buf) bu_freelist; /* (Q) */ + TAILQ_ENTRY(buf) b_freelist; /* (Q) */ struct { - void (*pg_iodone)(void *, vm_page_t *, int, int); - int pg_reqpage; - } bu_pager; - } b_union; -#define b_freelist b_union.bu_freelist -#define b_pager b_union.bu_pager + void (*b_pgiodone)(void *, vm_page_t *, int, int); + int b_pgbefore; + int b_pgafter; + }; + }; union cluster_info { TAILQ_HEAD(cluster_list_head, buf) cluster_head; TAILQ_ENTRY(buf) cluster_entry; Modified: head/sys/vm/default_pager.c ============================================================================== --- head/sys/vm/default_pager.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/vm/default_pager.c Wed Dec 16 21:30:45 2015 (r292373) @@ -56,7 +56,7 @@ __FBSDID("$FreeBSD$"); static vm_object_t default_pager_alloc(void *, vm_ooffset_t, vm_prot_t, vm_ooffset_t, struct ucred *); static void default_pager_dealloc(vm_object_t); -static int default_pager_getpages(vm_object_t, vm_page_t *, int, int); +static int default_pager_getpages(vm_object_t, vm_page_t *, int, int *, int *); static void default_pager_putpages(vm_object_t, vm_page_t *, int, boolean_t, int *); static boolean_t default_pager_haspage(vm_object_t, vm_pindex_t, int *, @@ -122,13 +122,11 @@ default_pager_dealloc(object) * see a vm_page with assigned swap here. 
*/ static int -default_pager_getpages(object, m, count, reqpage) - vm_object_t object; - vm_page_t *m; - int count; - int reqpage; +default_pager_getpages(vm_object_t object, vm_page_t *m, int count, + int *rbehind, int *rahead) { - return VM_PAGER_FAIL; + + return (VM_PAGER_FAIL); } /* Modified: head/sys/vm/device_pager.c ============================================================================== --- head/sys/vm/device_pager.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/vm/device_pager.c Wed Dec 16 21:30:45 2015 (r292373) @@ -59,7 +59,7 @@ static void dev_pager_init(void); static vm_object_t dev_pager_alloc(void *, vm_ooffset_t, vm_prot_t, vm_ooffset_t, struct ucred *); static void dev_pager_dealloc(vm_object_t); -static int dev_pager_getpages(vm_object_t, vm_page_t *, int, int); +static int dev_pager_getpages(vm_object_t, vm_page_t *, int, int *, int *); static void dev_pager_putpages(vm_object_t, vm_page_t *, int, int, int *); static boolean_t dev_pager_haspage(vm_object_t, vm_pindex_t, int *, int *); static void dev_pager_free_page(vm_object_t object, vm_page_t m); @@ -257,28 +257,33 @@ dev_pager_dealloc(vm_object_t object) } static int -dev_pager_getpages(vm_object_t object, vm_page_t *ma, int count, int reqpage) +dev_pager_getpages(vm_object_t object, vm_page_t *ma, int count, int *rbehind, + int *rahead) { int error; + /* Since our haspage reports zero after/before, the count is 1. 
*/ + KASSERT(count == 1, ("%s: count %d", __func__, count)); VM_OBJECT_ASSERT_WLOCKED(object); error = object->un_pager.devp.ops->cdev_pg_fault(object, - IDX_TO_OFF(ma[reqpage]->pindex), PROT_READ, &ma[reqpage]); + IDX_TO_OFF(ma[0]->pindex), PROT_READ, &ma[0]); VM_OBJECT_ASSERT_WLOCKED(object); - vm_pager_free_nonreq(object, ma, reqpage, count, TRUE); - if (error == VM_PAGER_OK) { KASSERT((object->type == OBJT_DEVICE && - (ma[reqpage]->oflags & VPO_UNMANAGED) != 0) || + (ma[0]->oflags & VPO_UNMANAGED) != 0) || (object->type == OBJT_MGTDEVICE && - (ma[reqpage]->oflags & VPO_UNMANAGED) == 0), - ("Wrong page type %p %p", ma[reqpage], object)); + (ma[0]->oflags & VPO_UNMANAGED) == 0), + ("Wrong page type %p %p", ma[0], object)); if (object->type == OBJT_DEVICE) { TAILQ_INSERT_TAIL(&object->un_pager.devp.devp_pglist, - ma[reqpage], plinks.q); + ma[0], plinks.q); } + if (rbehind) + *rbehind = 0; + if (rahead) + *rahead = 0; } return (error); Modified: head/sys/vm/phys_pager.c ============================================================================== --- head/sys/vm/phys_pager.c Wed Dec 16 21:15:12 2015 (r292372) +++ head/sys/vm/phys_pager.c Wed Dec 16 21:30:45 2015 (r292373) @@ -139,7 +139,8 @@ phys_pager_dealloc(vm_object_t object) * Fill as many pages as vm_fault has allocated for us. */ static int -phys_pager_getpages(vm_object_t object, vm_page_t *m, int count, int reqpage) +phys_pager_getpages(vm_object_t object, vm_page_t *m, int count, int *rbehind, + int *rahead) { int i; @@ -154,14 +155,11 @@ phys_pager_getpages(vm_object_t object, ("phys_pager_getpages: partially valid page %p", m[i])); KASSERT(m[i]->dirty == 0, ("phys_pager_getpages: dirty page %p", m[i])); - /* The requested page must remain busy, the others not. 
*/
-		if (i == reqpage) {
-			vm_page_lock(m[i]);
-			vm_page_flash(m[i]);
-			vm_page_unlock(m[i]);
-		} else
-			vm_page_xunbusy(m[i]);
 	}
+	if (rbehind)
+		*rbehind = 0;
+	if (rahead)
+		*rahead = 0;
 	return (VM_PAGER_OK);
 }

Modified: head/sys/vm/sg_pager.c
==============================================================================
--- head/sys/vm/sg_pager.c	Wed Dec 16 21:15:12 2015	(r292372)
+++ head/sys/vm/sg_pager.c	Wed Dec 16 21:30:45 2015	(r292373)
@@ -49,7 +49,7 @@ __FBSDID("$FreeBSD$");
 static vm_object_t sg_pager_alloc(void *, vm_ooffset_t, vm_prot_t,
 		vm_ooffset_t, struct ucred *);
 static void sg_pager_dealloc(vm_object_t);
-static int sg_pager_getpages(vm_object_t, vm_page_t *, int, int);
+static int sg_pager_getpages(vm_object_t, vm_page_t *, int, int *, int *);
 static void sg_pager_putpages(vm_object_t, vm_page_t *, int,
 		boolean_t, int *);
 static boolean_t sg_pager_haspage(vm_object_t, vm_pindex_t, int *,
@@ -135,7 +135,8 @@ sg_pager_dealloc(vm_object_t object)
 }

 static int
-sg_pager_getpages(vm_object_t object, vm_page_t *m, int count, int reqpage)
+sg_pager_getpages(vm_object_t object, vm_page_t *m, int count, int *rbehind,
+    int *rahead)
 {
 	struct sglist *sg;
 	vm_page_t m_paddr, page;
@@ -145,11 +146,13 @@ sg_pager_getpages(vm_object_t object, vm
 	size_t space;
 	int i;

+	/* Since our haspage reports zero after/before, the count is 1. */
+	KASSERT(count == 1, ("%s: count %d", __func__, count));
 	VM_OBJECT_ASSERT_WLOCKED(object);
 	sg = object->handle;
 	memattr = object->memattr;
 	VM_OBJECT_WUNLOCK(object);
-	offset = m[reqpage]->pindex;
+	offset = m[0]->pindex;

 	/*
 	 * Lookup the physical address of the requested page. An initial
@@ -178,26 +181,23 @@ sg_pager_getpages(vm_object_t object, vm
 	}

 	/* Return a fake page for the requested page. */
-	KASSERT(!(m[reqpage]->flags & PG_FICTITIOUS),
+	KASSERT(!(m[0]->flags & PG_FICTITIOUS),
 	    ("backing page for SG is fake"));

 	/* Construct a new fake page. */
 	page = vm_page_getfake(paddr, memattr);
 	VM_OBJECT_WLOCK(object);
 	TAILQ_INSERT_TAIL(&object->un_pager.sgp.sgp_pglist, page, plinks.q);
-
-	/* Free the original pages and insert this fake page into the object. */
-	for (i = 0; i < count; i++) {
-		if (i == reqpage &&
-		    vm_page_replace(page, object, offset) != m[i])
-			panic("sg_pager_getpages: invalid place replacement");
-		vm_page_lock(m[i]);
-		vm_page_free(m[i]);
-		vm_page_unlock(m[i]);
-	}
-	m[reqpage] = page;
+	if (vm_page_replace(page, object, offset) != m[0])
+		panic("sg_pager_getpages: invalid place replacement");
+	m[0] = page;
 	page->valid = VM_PAGE_BITS_ALL;

+	if (rbehind)
+		*rbehind = 0;
+	if (rahead)
+		*rahead = 0;
+
 	return (VM_PAGER_OK);
 }

Modified: head/sys/vm/swap_pager.c
==============================================================================
--- head/sys/vm/swap_pager.c	Wed Dec 16 21:15:12 2015	(r292372)
+++ head/sys/vm/swap_pager.c	Wed Dec 16 21:30:45 2015	(r292373)
@@ -357,9 +357,10 @@ static vm_object_t
 		swap_pager_alloc(void *handle, vm_ooffset_t size,
 		    vm_prot_t prot, vm_ooffset_t offset, struct ucred *);
 static void swap_pager_dealloc(vm_object_t object);
-static int swap_pager_getpages(vm_object_t, vm_page_t *, int, int);
-static int swap_pager_getpages_async(vm_object_t, vm_page_t *, int, int,
-    pgo_getpages_iodone_t, void *);
+static int swap_pager_getpages(vm_object_t, vm_page_t *, int, int *,
+    int *);
+static int swap_pager_getpages_async(vm_object_t, vm_page_t *, int, int *,
+    int *, pgo_getpages_iodone_t, void *);
 static void swap_pager_putpages(vm_object_t, vm_page_t *, int, boolean_t, int *);
 static boolean_t
 		swap_pager_haspage(vm_object_t object, vm_pindex_t pindex, int *before, int *after);
@@ -413,16 +414,6 @@ static void swp_pager_meta_free(vm_objec
 static void swp_pager_meta_free_all(vm_object_t);
 static daddr_t swp_pager_meta_ctl(vm_object_t, vm_pindex_t, int);

-static void
-swp_pager_free_nrpage(vm_page_t m)
-{
-
-	vm_page_lock(m);
-	if (m->wire_count == 0)
-		vm_page_free(m);
-	vm_page_unlock(m);
-}
-
 /*
  * SWP_SIZECHECK() - update swap_pager_full indication
  *
@@ -1103,16 +1094,12 @@ swap_pager_unswapped(vm_page_t m)
  * left busy, but the others adjusted.
  */
 static int
-swap_pager_getpages(vm_object_t object, vm_page_t *m, int count, int reqpage)
+swap_pager_getpages(vm_object_t object, vm_page_t *m, int count, int *rbehind,
+    int *rahead)
 {
 	struct buf *bp;
-	vm_page_t mreq;
-	int i;
-	int j;
 	daddr_t blk;

-	mreq = m[reqpage];
-
 	/*
 	 * Calculate range to retrieve. The pages have already been assigned
 	 * their swapblks. We require a *contiguous* range but we know it to
@@ -1122,45 +1109,18 @@ swap_pager_getpages(vm_object_t object,
 	 *
 	 * The swp_*() calls must be made with the object locked.
 	 */
-	blk = swp_pager_meta_ctl(mreq->object, mreq->pindex, 0);
+	blk = swp_pager_meta_ctl(m[0]->object, m[0]->pindex, 0);

-	for (i = reqpage - 1; i >= 0; --i) {
-		daddr_t iblk;
-
-		iblk = swp_pager_meta_ctl(m[i]->object, m[i]->pindex, 0);
-		if (blk != iblk + (reqpage - i))
-			break;
-	}
-	++i;
-
-	for (j = reqpage + 1; j < count; ++j) {
-		daddr_t jblk;
-
-		jblk = swp_pager_meta_ctl(m[j]->object, m[j]->pindex, 0);
-		if (blk != jblk - (j - reqpage))
-			break;
-	}
-
-	/*
-	 * free pages outside our collection range. Note: we never free
-	 * mreq, it must remain busy throughout.
-	 */
-	if (0 < i || j < count) {
-		int k;
-
-		for (k = 0; k < i; ++k)
-			swp_pager_free_nrpage(m[k]);
-		for (k = j; k < count; ++k)
-			swp_pager_free_nrpage(m[k]);
-	}
-
-	/*
-	 * Return VM_PAGER_FAIL if we have nothing to do. Return mreq
-	 * still busy, but the others unbusied.
-	 */
 	if (blk == SWAPBLK_NONE)
 		return (VM_PAGER_FAIL);

+#ifdef INVARIANTS
+	for (int i = 0; i < count; i++)
+		KASSERT(blk + i ==
+		    swp_pager_meta_ctl(m[i]->object, m[i]->pindex, 0),
+		    ("%s: range is not contiguous", __func__));
+#endif
+
 	/*
 	 * Getpbuf() can sleep.
 	 */
@@ -1175,21 +1135,16 @@ swap_pager_getpages(vm_object_t object,
 	bp->b_iodone = swp_pager_async_iodone;
 	bp->b_rcred = crhold(thread0.td_ucred);
 	bp->b_wcred = crhold(thread0.td_ucred);
-	bp->b_blkno = blk - (reqpage - i);
-	bp->b_bcount = PAGE_SIZE * (j - i);
-	bp->b_bufsize = PAGE_SIZE * (j - i);
-	bp->b_pager.pg_reqpage = reqpage - i;
+	bp->b_blkno = blk;
+	bp->b_bcount = PAGE_SIZE * count;
+	bp->b_bufsize = PAGE_SIZE * count;
+	bp->b_npages = count;

 	VM_OBJECT_WLOCK(object);
-	{
-		int k;
-
-		for (k = i; k < j; ++k) {
-			bp->b_pages[k - i] = m[k];
-			m[k]->oflags |= VPO_SWAPINPROG;
-		}
+	for (int i = 0; i < count; i++) {
+		bp->b_pages[i] = m[i];
+		m[i]->oflags |= VPO_SWAPINPROG;
 	}
-	bp->b_npages = j - i;

 	PCPU_INC(cnt.v_swapin);
 	PCPU_ADD(cnt.v_swappgsin, bp->b_npages);
@@ -1221,8 +1176,8 @@ swap_pager_getpages(vm_object_t object,
 	 * is set in the meta-data.
 	 */
 	VM_OBJECT_WLOCK(object);
-	while ((mreq->oflags & VPO_SWAPINPROG) != 0) {
-		mreq->oflags |= VPO_SWAPSLEEP;
+	while ((m[0]->oflags & VPO_SWAPINPROG) != 0) {
+		m[0]->oflags |= VPO_SWAPSLEEP;
 		PCPU_INC(cnt.v_intrans);
 		if (VM_OBJECT_SLEEP(object, &object->paging_in_progress, PSWP,
 		    "swread", hz * 20)) {
@@ -1233,15 +1188,18 @@ swap_pager_getpages(vm_object_t object,
 	}

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***