From owner-svn-src-all@freebsd.org Tue Sep 19 08:39:55 2017 Return-Path: Delivered-To: svn-src-all@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id C1653E050E2; Tue, 19 Sep 2017 08:39:55 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 9D73471288; Tue, 19 Sep 2017 08:39:55 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id v8J8dsrk038719; Tue, 19 Sep 2017 08:39:54 GMT (envelope-from avg@FreeBSD.org) Received: (from avg@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id v8J8dsot038716; Tue, 19 Sep 2017 08:39:54 GMT (envelope-from avg@FreeBSD.org) Message-Id: <201709190839.v8J8dsot038716@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: avg set sender to avg@FreeBSD.org using -f From: Andriy Gapon Date: Tue, 19 Sep 2017 08:39:54 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org Subject: svn commit: r323745 - in stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-10 X-SVN-Commit-Author: avg X-SVN-Commit-Paths: in stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Commit-Revision: 323745 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Sep 2017 08:39:55 -0000 Author: avg Date: Tue Sep 19 08:39:54 2017 New Revision: 323745 URL: https://svnweb.freebsd.org/changeset/base/323745 Log: MFC r320352: zfs: port vdev_file part of illumos change 3306 3306 zdb should be able to issue reads in parallel illumos/illumos-gate/31d7e8fa33fae995f558673adb22641b5aa8b6e1 https://www.illumos.org/issues/3306 The upstream change was made before we started to import upstream commits individually. It was imported into the illumos vendor area as r242733. That commit was MFV-ed in r260138, but as the commit message says vdev_file.c was left intact. This commit actually implements the parallel I/O for vdev_file using a taskqueue with multiple thread. This implementation does not depend on the illumos or FreeBSD bio interface at all, but uses zio_t to pass around all the relevent data. So, the code looks a bit different from the upstream. This commit also incorporates ZoL commit zfsonlinux/zfs/bc25c9325b0e5ced897b9820dad239539d561ec9 that fixed https://github.com/zfsonlinux/zfs/issues/2270 We need to use a dedicated taskqueue for exactly the same reason as ZoL as we do not implement TASKQ_DYNAMIC. Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_file.h stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c Directory Properties: stable/10/ (props changed) Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Tue Sep 19 08:34:13 2017 (r323744) +++ stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Tue Sep 19 08:39:54 2017 (r323745) @@ -39,6 +39,7 @@ #include #include #include +#include #include #include #include @@ -2013,6 +2014,7 @@ spa_init(int mode) dmu_init(); zil_init(); vdev_cache_stat_init(); + vdev_file_init(); zfs_prop_init(); zpool_prop_init(); zpool_feature_init(); @@ -2032,6 +2034,7 @@ spa_fini(void) spa_evict_all(); + vdev_file_fini(); vdev_cache_stat_fini(); zil_fini(); dmu_fini(); Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_file.h ============================================================================== --- stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_file.h Tue Sep 19 08:34:13 2017 (r323744) +++ stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_file.h Tue Sep 19 08:39:54 2017 (r323745) @@ -39,6 +39,9 @@ typedef struct vdev_file { vnode_t *vf_vnode; } vdev_file_t; +extern void vdev_file_init(void); +extern void vdev_file_fini(void); + #ifdef __cplusplus } #endif Modified: stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c ============================================================================== --- stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c Tue Sep 19 08:34:13 2017 (r323744) +++ stable/10/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_file.c Tue Sep 19 08:39:54 2017 (r323745) @@ -35,6 +35,21 @@ * Virtual device vector for files. */ +static taskq_t *vdev_file_taskq; + +void +vdev_file_init(void) +{ + vdev_file_taskq = taskq_create("z_vdev_file", MAX(max_ncpus, 16), + minclsyspri, max_ncpus, INT_MAX, 0); +} + +void +vdev_file_fini(void) +{ + taskq_destroy(vdev_file_taskq); +} + static void vdev_file_hold(vdev_t *vd) { @@ -156,27 +171,58 @@ vdev_file_close(vdev_t *vd) vd->vdev_tsd = NULL; } +/* + * Implements the interrupt side for file vdev types. This routine will be + * called when the I/O completes allowing us to transfer the I/O to the + * interrupt taskqs. For consistency, the code structure mimics disk vdev + * types. + */ static void -vdev_file_io_start(zio_t *zio) +vdev_file_io_intr(zio_t *zio) { + zio_delay_interrupt(zio); +} + +static void +vdev_file_io_strategy(void *arg) +{ + zio_t *zio = arg; vdev_t *vd = zio->io_vd; vdev_file_t *vf; vnode_t *vp; ssize_t resid; - if (!vdev_readable(vd)) { - zio->io_error = SET_ERROR(ENXIO); - zio_interrupt(zio); - return; - } - vf = vd->vdev_tsd; vp = vf->vf_vnode; + ASSERT(zio->io_type == ZIO_TYPE_READ || zio->io_type == ZIO_TYPE_WRITE); + zio->io_error = vn_rdwr(zio->io_type == ZIO_TYPE_READ ? + UIO_READ : UIO_WRITE, vp, zio->io_data, zio->io_size, + zio->io_offset, UIO_SYSSPACE, 0, RLIM64_INFINITY, kcred, &resid); + + if (resid != 0 && zio->io_error == 0) + zio->io_error = ENOSPC; + + vdev_file_io_intr(zio); +} + +static void +vdev_file_io_start(zio_t *zio) +{ + vdev_t *vd = zio->io_vd; + vdev_file_t *vf = vd->vdev_tsd; + if (zio->io_type == ZIO_TYPE_IOCTL) { + /* XXPOLICY */ + if (!vdev_readable(vd)) { + zio->io_error = SET_ERROR(ENXIO); + zio_interrupt(zio); + return; + } + switch (zio->io_cmd) { case DKIOCFLUSHWRITECACHE: - zio->io_error = VOP_FSYNC(vp, FSYNC | FDSYNC, + zio->io_error = VOP_FSYNC(vf->vf_vnode, FSYNC | FDSYNC, kcred, NULL); break; default: @@ -190,19 +236,8 @@ vdev_file_io_start(zio_t *zio) ASSERT(zio->io_type == ZIO_TYPE_READ || zio->io_type == ZIO_TYPE_WRITE); zio->io_target_timestamp = zio_handle_io_delay(zio); - zio->io_error = vn_rdwr(zio->io_type == ZIO_TYPE_READ ? - UIO_READ : UIO_WRITE, vp, zio->io_data, zio->io_size, - zio->io_offset, UIO_SYSSPACE, 0, RLIM64_INFINITY, kcred, &resid); - - if (resid != 0 && zio->io_error == 0) - zio->io_error = ENOSPC; - - zio_delay_interrupt(zio); - -#ifdef illumos - VERIFY3U(taskq_dispatch(system_taskq, vdev_file_io_strategy, bp, + VERIFY3U(taskq_dispatch(vdev_file_taskq, vdev_file_io_strategy, zio, TQ_SLEEP), !=, 0); -#endif } /* ARGSUSED */