From: Warner Losh <imp@FreeBSD.org>
Date: Tue, 27 Nov 2018 00:36:35 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r341005 - head/sys/cam
Message-Id: <201811270036.wAR0aZZB035842@repo.freebsd.org>
Author: imp
Date: Tue Nov 27 00:36:35 2018
New Revision: 341005

URL: https://svnweb.freebsd.org/changeset/base/341005

Log:
  NVME trim clocking

  Add the ability to set two goals for trims in the I/O scheduler. The
  first goal is the number of BIO_DELETEs to accumulate
  (kern.cam.XX.U.trim_goal). When non-zero, this many trims will be
  accumulated before we start to transfer them to lower layers. This is
  useful for devices that like to get lots of trims all at once in one
  transaction (not all devices are like this, and some vary by
  workload).

  The second is a number of ticks to defer trims. If you've set a trim
  goal, then kern.cam.XX.U.trim_ticks controls how long the system will
  defer those trims before timing out and sending them anyway. It has
  no effect when trim_goal is 0.

  In any event, a BIO_FLUSH will cause all the TRIMs to be released to
  the periph drivers. This may be a minor overloading of what BIO_FLUSH
  is supposed to mean, but it's useful to preserve other ordering
  semantics that users of BIO_FLUSH rely on.
  Sponsored by:	Netflix, Inc

Modified:
  head/sys/cam/cam_iosched.c
  head/sys/cam/cam_iosched.h

Modified: head/sys/cam/cam_iosched.c
==============================================================================
--- head/sys/cam/cam_iosched.c	Mon Nov 26 23:09:45 2018	(r341004)
+++ head/sys/cam/cam_iosched.c	Tue Nov 27 00:36:35 2018	(r341005)
@@ -277,6 +277,10 @@ struct cam_iosched_softc {
 	/* scheduler flags < 16, user flags >= 16 */
 	uint32_t	flags;
 	int		sort_io_queue;
+	int		trim_goal;		/* # of trims to queue before sending */
+	int		trim_ticks;		/* Max ticks to hold trims */
+	int		last_trim_tick;		/* Last 'tick' time we loaded a trim */
+	int		queued_trims;		/* Number of trims in the queue */
 #ifdef CAM_IOSCHED_DYNAMIC
 	int		read_bias;		/* Read bias setting */
 	int		current_read_bias;	/* Current read bias state */
@@ -751,6 +755,22 @@ cam_iosched_has_io(struct cam_iosched_softc *isc)
 static inline bool
 cam_iosched_has_more_trim(struct cam_iosched_softc *isc)
 {
+
+	/*
+	 * If a trim_goal is set, allow trims through to the driver once we
+	 * have accumulated that many, or once the tick timeout has expired
+	 * (if one is set). Otherwise, don't allow trims through yet.
+	 */
+	if (isc->trim_goal > 0) {
+		if (isc->queued_trims >= isc->trim_goal)
+			return true;
+		if (isc->queued_trims > 0 &&
+		    isc->trim_ticks > 0 &&
+		    ticks - isc->last_trim_tick > isc->trim_ticks)
+			return true;
+		return false;
+	}
+
 	return !(isc->flags & CAM_IOSCHED_FLAG_TRIM_ACTIVE) &&
 	    bioq_first(&isc->trim_queue);
 }
@@ -1131,14 +1151,21 @@ cam_iosched_fini(struct cam_iosched_softc *isc)
 void
 cam_iosched_sysctl_init(struct cam_iosched_softc *isc,
     struct sysctl_ctx_list *ctx, struct sysctl_oid *node)
 {
-#ifdef CAM_IOSCHED_DYNAMIC
 	struct sysctl_oid_list *n;
-#endif
 
-	SYSCTL_ADD_INT(ctx, SYSCTL_CHILDREN(node),
+	n = SYSCTL_CHILDREN(node);
+	SYSCTL_ADD_INT(ctx, n,
 	    OID_AUTO, "sort_io_queue", CTLFLAG_RW | CTLFLAG_MPSAFE,
 	    &isc->sort_io_queue, 0,
 	    "Sort IO queue to try and optimise disk access patterns");
+	SYSCTL_ADD_INT(ctx, n,
+	    OID_AUTO, "trim_goal", CTLFLAG_RW,
+	    &isc->trim_goal, 0,
+	    "Number of trims to try to accumulate before sending to hardware");
+	SYSCTL_ADD_INT(ctx, n,
+	    OID_AUTO, "trim_ticks", CTLFLAG_RW,
+	    &isc->trim_ticks, 0,
+	    "I/O scheduler quanta to hold back trims for when accumulating");
 
 #ifdef CAM_IOSCHED_DYNAMIC
 	if (!do_dynamic_iosched)
@@ -1193,6 +1220,41 @@ cam_iosched_set_latfcn(struct cam_iosched_softc *isc,
 }
 
 /*
+ * Client drivers can set two parameters. "goal" is the number of BIO_DELETEs
+ * that will be queued up before iosched will "release" the trims to the client
+ * driver to do with what they will (usually combine as many as possible). If we
+ * don't get this many, after trim_ticks we'll submit the I/O anyway with
+ * whatever we have. We do need an I/O of some kind to clock the deferred
+ * trims out to disk. Since we will eventually get a write for the super block
+ * or something before we shutdown, the trims will complete. To be safe, when a
+ * BIO_FLUSH is presented to the iosched work queue, we set the ticks time far
+ * enough in the past so we'll present the BIO_DELETEs to the client driver.
+ * There might be a race if no BIO_DELETEs were queued, a BIO_FLUSH comes in
+ * and then a BIO_DELETE is sent down. No known client does this, and there's
+ * already a race between an ordered BIO_FLUSH and any BIO_DELETEs in flight,
+ * but no client depends on the ordering being honored.
+ *
+ * XXX I'm not sure what the interaction is between UFS direct BIOs and the BUF
+ * flushing on shutdown. I think there are bufs that would be dependent on the BIO
+ * finishing to write out at least metadata, so we'll be fine. To be safe, keep
+ * the number of ticks low (less than maybe 10s) to avoid shutdown races.
+ */
+
+void
+cam_iosched_set_trim_goal(struct cam_iosched_softc *isc, int goal)
+{
+
+	isc->trim_goal = goal;
+}
+
+void
+cam_iosched_set_trim_ticks(struct cam_iosched_softc *isc, int trim_ticks)
+{
+
+	isc->trim_ticks = trim_ticks;
+}
+
+/*
  * Flush outstanding I/O. Consumers of this library don't know all the
  * queues we may keep, so this allows all I/O to be flushed in one
  * convenient call.
@@ -1281,6 +1343,9 @@ void
 cam_iosched_put_back_trim(struct cam_iosched_softc *isc, struct bio *bp)
 {
 	bioq_insert_head(&isc->trim_queue, bp);
+	if (isc->queued_trims == 0)
+		isc->last_trim_tick = ticks;
+	isc->queued_trims++;
 #ifdef CAM_IOSCHED_DYNAMIC
 	isc->trim_stats.queued++;
 	isc->trim_stats.total--;	/* since we put it back, don't double count */
@@ -1304,6 +1369,8 @@ cam_iosched_next_trim(struct cam_iosched_softc *isc)
 	if (bp == NULL)
 		return NULL;
 	bioq_remove(&isc->trim_queue, bp);
+	isc->queued_trims--;
+	isc->last_trim_tick = ticks;	/* Reset the tick timer when we take trims */
 #ifdef CAM_IOSCHED_DYNAMIC
 	isc->trim_stats.queued--;
 	isc->trim_stats.total++;
@@ -1430,12 +1497,22 @@ cam_iosched_queue_work(struct cam_iosched_softc *isc,
 {
 
 	/*
-	 * Put all trims on the trim queue sorted, since we know
-	 * that the collapsing code requires this. Otherwise put
-	 * the work on the bio queue.
+	 * If we get a BIO_FLUSH, and we're doing delayed BIO_DELETEs then we
+	 * set the last tick time to one less than the current ticks minus the
+	 * delay to force the BIO_DELETEs to be presented to the client driver.
+	 */
+	if (bp->bio_cmd == BIO_FLUSH && isc->trim_ticks > 0)
+		isc->last_trim_tick = ticks - isc->trim_ticks - 1;
+
+	/*
+	 * Put all trims on the trim queue. Otherwise put the work on the bio
+	 * queue.
 	 */
 	if (bp->bio_cmd == BIO_DELETE) {
 		bioq_insert_tail(&isc->trim_queue, bp);
+		if (isc->queued_trims == 0)
+			isc->last_trim_tick = ticks;
+		isc->queued_trims++;
 #ifdef CAM_IOSCHED_DYNAMIC
 		isc->trim_stats.in++;
 		isc->trim_stats.queued++;

Modified: head/sys/cam/cam_iosched.h
==============================================================================
--- head/sys/cam/cam_iosched.h	Mon Nov 26 23:09:45 2018	(r341004)
+++ head/sys/cam/cam_iosched.h	Tue Nov 27 00:36:35 2018	(r341005)
@@ -101,6 +101,7 @@ void cam_iosched_clr_work_flags(struct cam_iosched_sof
 void cam_iosched_trim_done(struct cam_iosched_softc *isc);
 int cam_iosched_bio_complete(struct cam_iosched_softc *isc, struct bio *bp, union ccb *done_ccb);
 void cam_iosched_set_latfcn(struct cam_iosched_softc *isc, cam_iosched_latfcn_t, void *);
-
+void cam_iosched_set_trim_goal(struct cam_iosched_softc *isc, int goal);
+void cam_iosched_set_trim_ticks(struct cam_iosched_softc *isc, int ticks);
 #endif
 #endif