From: "Justin T. Gibbs"
Date: Mon, 20 Sep 2010 23:39:00 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
    svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r212939 - in stable/8/sys: cam/ata cam/scsi
    cddl/contrib/opensolaris/uts/common/fs/zfs geom geom/sched kern sys
Message-Id: <201009202339.o8KNd0lG071602@svn.freebsd.org>
X-SVN-Group: stable-8

Author: gibbs
Date: Mon Sep 20 23:39:00 2010
New Revision: 212939
URL: http://svn.freebsd.org/changeset/base/212939

Log:
  MFC 212160:

  Correct bioq_disksort so that bioq_insert_tail() offers barrier semantics.
  Add the BIO_ORDERED flag for struct bio and update bio clients to use it.

  The barrier semantics of bioq_insert_tail() were broken in two ways:

   o In bioq_disksort(), an added bio could be inserted at the head of
     the queue, even when a barrier was present, if the sort key for
     the new entry was less than that of the last queued barrier bio.

   o The last_offset used to generate the sort key for newly queued bios
     did not stay at the position of the barrier until either the
     barrier was de-queued, or a new barrier (which updates last_offset)
     was queued.  When a barrier is in effect, we know that the disk
     will pass through the barrier position just before the
     "blocked bios" are released, so using the barrier's offset for
     last_offset is the optimal choice.

  sys/geom/sched/subr_disk.c:
  sys/kern/subr_disk.c:
	o Update last_offset in bioq_insert_tail().

	o Only update last_offset in bioq_remove() if the removed bio is
	  at the head of the queue (typically due to a call via
	  bioq_takefirst()) and no barrier is active.

	o In bioq_disksort(), if we have a barrier (insert_point is
	  non-NULL), set prev to the barrier and cur to its next
	  element.  Now that last_offset is kept at the barrier
	  position, this change isn't strictly necessary, but since we
	  have to take a decision branch anyway, it does avoid one,
	  no-op, loop iteration in the while loop that immediately
	  follows.

	o In bioq_disksort(), bypass the normal sort for bios with the
	  BIO_ORDERED attribute and instead insert them into the queue
	  with bioq_insert_tail().  bioq_insert_tail() not only gives
	  the desired command order during insertion, but also provides
	  barrier semantics so that commands disksorted in the future
	  cannot pass the just enqueued transaction.

  sys/sys/bio.h:
	Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.

  sys/cam/ata/ata_da.c:
  sys/cam/scsi/scsi_da.c
	Use an ordered command for SCSI/ATA-NCQ commands issued in
	response to bios with the BIO_ORDERED flag set.

  sys/cam/scsi/scsi_da.c
	Use an ordered tag when issuing a synchronize cache command.

	Wrap some lines to 80 columns.

  sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
  sys/geom/geom_io.c
	Mark bios with the BIO_FLUSH command as BIO_ORDERED.

  Sponsored by:	Spectra Logic Corporation
  ------------------------------------------------------------------------

Modified:
  stable/8/sys/cam/ata/ata_da.c
  stable/8/sys/cam/scsi/scsi_da.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
  stable/8/sys/geom/geom_io.c
  stable/8/sys/geom/sched/subr_disk.c
  stable/8/sys/kern/subr_disk.c
  stable/8/sys/sys/bio.h
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/amd64/include/xen/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)
  stable/8/sys/contrib/dev/acpica/   (props changed)
  stable/8/sys/contrib/pf/   (props changed)
  stable/8/sys/dev/xen/xenpci/   (props changed)

Modified: stable/8/sys/cam/ata/ata_da.c
==============================================================================
--- stable/8/sys/cam/ata/ata_da.c	Mon Sep 20 23:36:54 2010	(r212938)
+++ stable/8/sys/cam/ata/ata_da.c	Mon Sep 20 23:39:00 2010	(r212939)
@@ -870,7 +870,8 @@ adastart(struct cam_periph *periph, unio
 		}
 		bioq_remove(&softc->bio_queue, bp);
 
-		if ((softc->flags & ADA_FLAG_NEED_OTAG) != 0) {
+		if ((bp->bio_flags & BIO_ORDERED) != 0
+		 || (softc->flags & ADA_FLAG_NEED_OTAG) != 0) {
 			softc->flags &= ~ADA_FLAG_NEED_OTAG;
 			softc->ordered_tag_count++;
 			tag_code = 0;

Modified: stable/8/sys/cam/scsi/scsi_da.c
==============================================================================
--- stable/8/sys/cam/scsi/scsi_da.c	Mon Sep 20 23:36:54 2010	(r212938)
+++ stable/8/sys/cam/scsi/scsi_da.c	Mon Sep 20 23:39:00 2010	(r212939)
@@ -1318,7 +1318,8 @@ dastart(struct cam_periph *periph, union
 
 			bioq_remove(&softc->bio_queue, bp);
 
-			if ((softc->flags & DA_FLAG_NEED_OTAG) != 0) {
+			if ((bp->bio_flags & BIO_ORDERED) != 0
+			 || (softc->flags & DA_FLAG_NEED_OTAG) != 0) {
 				softc->flags &= ~DA_FLAG_NEED_OTAG;
 				softc->ordered_tag_count++;
 				tag_code = MSG_ORDERED_Q_TAG;
@@ -1332,7 +1333,8 @@ dastart(struct cam_periph *periph, union
 					/*retries*/da_retry_count,
 					/*cbfcnp*/dadone,
 					/*tag_action*/tag_code,
-					/*read_op*/bp->bio_cmd == BIO_READ,
+					/*read_op*/bp->bio_cmd
+						== BIO_READ,
 					/*byte2*/0,
 					softc->minimum_cmd_size,
 					/*lba*/bp->bio_pblkno,
@@ -1341,17 +1343,24 @@ dastart(struct cam_periph *periph, union
 					/*data_ptr*/ bp->bio_data,
 					/*dxfer_len*/ bp->bio_bcount,
 					/*sense_len*/SSD_FULL_SIZE,
-					/*timeout*/da_default_timeout*1000);
+					da_default_timeout * 1000);
 				break;
 			case BIO_FLUSH:
+				/*
+				 * BIO_FLUSH doesn't currently communicate
+				 * range data, so we synchronize the cache
+				 * over the whole disk.  We also force
+				 * ordered tag semantics the flush applies
+				 * to all previously queued I/O.
+				 */
 				scsi_synchronize_cache(&start_ccb->csio,
 						       /*retries*/1,
 						       /*cbfcnp*/dadone,
-						       MSG_SIMPLE_Q_TAG,
-						       /*begin_lba*/0,/* Cover the whole disk */
+						       MSG_ORDERED_Q_TAG,
+						       /*begin_lba*/0,
 						       /*lb_count*/0,
 						       SSD_FULL_SIZE,
-						       /*timeout*/da_default_timeout*1000);
+						       da_default_timeout*1000);
 				break;
 			}
 			start_ccb->ccb_h.ccb_state = DA_CCB_BUFFER_IO;

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c	Mon Sep 20 23:36:54 2010	(r212938)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c	Mon Sep 20 23:39:00 2010	(r212939)
@@ -597,6 +597,7 @@ sendreq:
 		break;
 	case ZIO_TYPE_IOCTL:
 		bp->bio_cmd = BIO_FLUSH;
+		bp->bio_flags |= BIO_ORDERED;
 		bp->bio_data = NULL;
 		bp->bio_offset = cp->provider->mediasize;
 		bp->bio_length = 0;

Modified: stable/8/sys/geom/geom_io.c
==============================================================================
--- stable/8/sys/geom/geom_io.c	Mon Sep 20 23:36:54 2010	(r212938)
+++ stable/8/sys/geom/geom_io.c	Mon Sep 20 23:39:00 2010	(r212939)
@@ -265,6 +265,7 @@ g_io_flush(struct g_consumer *cp)
 	g_trace(G_T_BIO, "bio_flush(%s)", cp->provider->name);
 	bp = g_alloc_bio();
 	bp->bio_cmd = BIO_FLUSH;
+	bp->bio_flags |= BIO_ORDERED;
 	bp->bio_done = NULL;
 	bp->bio_attribute = NULL;
 	bp->bio_offset = cp->provider->mediasize;

Modified: stable/8/sys/geom/sched/subr_disk.c
==============================================================================
--- stable/8/sys/geom/sched/subr_disk.c	Mon Sep 20 23:36:54 2010	(r212938)
+++ stable/8/sys/geom/sched/subr_disk.c	Mon Sep 20 23:39:00 2010	(r212939)
@@ -86,7 +86,7 @@ __FBSDID("$FreeBSD$");
  * bioq_remove()	remove a generic element from the queue, act as
  *		bioq_takefirst() if invoked on the head of the queue.
  *
- * The semantic of these methods is the same of the operations
+ * The semantic of these methods is the same as the operations
  * on the underlying TAILQ, but with additional guarantees on
  * subsequent bioq_disksort() calls. E.g. bioq_insert_tail()
  * can be useful for making sure that all previous ops are flushed
@@ -115,10 +115,10 @@ void
 gs_bioq_remove(struct bio_queue_head *head, struct bio *bp)
 {
 
-	if (bp == TAILQ_FIRST(&head->queue))
-		head->last_offset = bp->bio_offset + bp->bio_length;
-
-	if (bp == head->insert_point)
+	if (head->insert_point == NULL) {
+		if (bp == TAILQ_FIRST(&head->queue))
+			head->last_offset = bp->bio_offset + bp->bio_length;
+	} else if (bp == head->insert_point)
 		head->insert_point = NULL;
 
 	TAILQ_REMOVE(&head->queue, bp, bio_queue);
@@ -137,7 +137,8 @@ void
 gs_bioq_insert_head(struct bio_queue_head *head, struct bio *bp)
 {
 
-	head->last_offset = bp->bio_offset;
+	if (head->insert_point == NULL)
+		head->last_offset = bp->bio_offset;
 	TAILQ_INSERT_HEAD(&head->queue, bp, bio_queue);
 }
 
@@ -147,6 +148,7 @@ gs_bioq_insert_tail(struct bio_queue_hea
 
 	TAILQ_INSERT_TAIL(&head->queue, bp, bio_queue);
 	head->insert_point = bp;
+	head->last_offset = bp->bio_offset;
 }
 
 struct bio *
@@ -189,13 +191,28 @@ gs_bioq_bio_key(struct bio_queue_head *h
 void
 gs_bioq_disksort(struct bio_queue_head *head, struct bio *bp)
 {
-	struct bio *cur, *prev = NULL;
-	uoff_t key = gs_bioq_bio_key(head, bp);
+	struct bio *cur, *prev;
+	uoff_t key;
 
+	if ((bp->bio_flags & BIO_ORDERED) != 0) {
+		/*
+		 * Ordered transactions can only be dispatched
+		 * after any currently queued transactions.  They
+		 * also have barrier semantics - no transactions
+		 * queued in the future can pass them.
+		 */
+		gs_bioq_insert_tail(head, bp);
+		return;
+	}
+
+	prev = NULL;
+	key = gs_bioq_bio_key(head, bp);
 	cur = TAILQ_FIRST(&head->queue);
 
-	if (head->insert_point)
-		cur = head->insert_point;
+	if (head->insert_point) {
+		prev = head->insert_point;
+		cur = TAILQ_NEXT(head->insert_point, bio_queue);
+	}
 
 	while (cur != NULL && key >= gs_bioq_bio_key(head, cur)) {
 		prev = cur;

Modified: stable/8/sys/kern/subr_disk.c
==============================================================================
--- stable/8/sys/kern/subr_disk.c	Mon Sep 20 23:36:54 2010	(r212938)
+++ stable/8/sys/kern/subr_disk.c	Mon Sep 20 23:39:00 2010	(r212939)
@@ -127,7 +127,7 @@ disk_err(struct bio *bp, const char *wha
  * bioq_remove()	remove a generic element from the queue, act as
  *		bioq_takefirst() if invoked on the head of the queue.
  *
- * The semantic of these methods is the same of the operations
+ * The semantic of these methods is the same as the operations
  * on the underlying TAILQ, but with additional guarantees on
  * subsequent bioq_disksort() calls. E.g. bioq_insert_tail()
  * can be useful for making sure that all previous ops are flushed
@@ -156,10 +156,10 @@ void
 bioq_remove(struct bio_queue_head *head, struct bio *bp)
 {
 
-	if (bp == TAILQ_FIRST(&head->queue))
-		head->last_offset = bp->bio_offset + bp->bio_length;
-
-	if (bp == head->insert_point)
+	if (head->insert_point == NULL) {
+		if (bp == TAILQ_FIRST(&head->queue))
+			head->last_offset = bp->bio_offset + bp->bio_length;
+	} else if (bp == head->insert_point)
 		head->insert_point = NULL;
 
 	TAILQ_REMOVE(&head->queue, bp, bio_queue);
@@ -178,7 +178,8 @@ void
 bioq_insert_head(struct bio_queue_head *head, struct bio *bp)
 {
 
-	head->last_offset = bp->bio_offset;
+	if (head->insert_point == NULL)
+		head->last_offset = bp->bio_offset;
 	TAILQ_INSERT_HEAD(&head->queue, bp, bio_queue);
 }
 
@@ -188,6 +189,7 @@ bioq_insert_tail(struct bio_queue_head *
 
 	TAILQ_INSERT_TAIL(&head->queue, bp, bio_queue);
 	head->insert_point = bp;
+	head->last_offset = bp->bio_offset;
 }
 
 struct bio *
@@ -230,13 +232,28 @@ bioq_bio_key(struct bio_queue_head *head
 void
 bioq_disksort(struct bio_queue_head *head, struct bio *bp)
 {
-	struct bio *cur, *prev = NULL;
-	uoff_t key = bioq_bio_key(head, bp);
+	struct bio *cur, *prev;
+	uoff_t key;
 
+	if ((bp->bio_flags & BIO_ORDERED) != 0) {
+		/*
+		 * Ordered transactions can only be dispatched
+		 * after any currently queued transactions.  They
+		 * also have barrier semantics - no transactions
+		 * queued in the future can pass them.
+		 */
+		bioq_insert_tail(head, bp);
+		return;
+	}
+
+	prev = NULL;
+	key = bioq_bio_key(head, bp);
 	cur = TAILQ_FIRST(&head->queue);
 
-	if (head->insert_point)
-		cur = head->insert_point;
+	if (head->insert_point) {
+		prev = head->insert_point;
+		cur = TAILQ_NEXT(head->insert_point, bio_queue);
+	}
 
 	while (cur != NULL && key >= bioq_bio_key(head, cur)) {
 		prev = cur;

Modified: stable/8/sys/sys/bio.h
==============================================================================
--- stable/8/sys/sys/bio.h	Mon Sep 20 23:36:54 2010	(r212938)
+++ stable/8/sys/sys/bio.h	Mon Sep 20 23:39:00 2010	(r212939)
@@ -54,6 +54,7 @@
 #define BIO_ERROR	0x01
 #define BIO_DONE	0x02
 #define BIO_ONQUEUE	0x04
+#define BIO_ORDERED	0x08
 
 #ifdef _KERNEL
 struct disk;
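
Illustrative note (not part of the commit): the barrier behaviour described
in the log can be tried outside the kernel with a small userland model.
The sketch below is only an approximation under stated assumptions.  The
struct layouts and every sketch_* name are hypothetical, a singly linked
list stands in for the kernel TAILQ, and the sort key is assumed to be the
unsigned difference between a bio's offset and last_offset.  Only the
BIO_ORDERED value and the control flow of the disksort/insert_tail/remove
hunks are taken from the diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIO_ORDERED	0x08	/* value from the sys/sys/bio.h hunk */

/* Hypothetical stand-ins for struct bio and struct bio_queue_head. */
struct sketch_bio {
	uint64_t		 bio_offset;
	uint64_t		 bio_length;
	int			 bio_flags;
	struct sketch_bio	*next;	/* replaces the TAILQ linkage */
};

struct sketch_bioq {
	struct sketch_bio	*first, *last;
	struct sketch_bio	*insert_point;	/* last barrier, or NULL */
	uint64_t		 last_offset;
};

/* Assumed key: unsigned distance from last_offset (wraps for lower offsets). */
static uint64_t
sketch_key(const struct sketch_bioq *q, const struct sketch_bio *bp)
{
	return (bp->bio_offset - q->last_offset);
}

/* Tail insert: the bio becomes the barrier and pins last_offset. */
static void
sketch_insert_tail(struct sketch_bioq *q, struct sketch_bio *bp)
{
	assert(bp != NULL);
	bp->next = NULL;
	if (q->last != NULL)
		q->last->next = bp;
	else
		q->first = bp;
	q->last = bp;
	q->insert_point = bp;
	q->last_offset = bp->bio_offset;
}

/* Sorted insert that never places a bio ahead of the current barrier. */
static void
sketch_disksort(struct sketch_bioq *q, struct sketch_bio *bp)
{
	struct sketch_bio *cur, *prev;
	uint64_t key;

	if ((bp->bio_flags & BIO_ORDERED) != 0) {
		sketch_insert_tail(q, bp);
		return;
	}
	prev = NULL;
	key = sketch_key(q, bp);
	cur = q->first;
	if (q->insert_point != NULL) {
		prev = q->insert_point;
		cur = prev->next;
	}
	while (cur != NULL && key >= sketch_key(q, cur)) {
		prev = cur;
		cur = cur->next;
	}
	bp->next = cur;
	if (prev == NULL)
		q->first = bp;
	else
		prev->next = bp;
	if (cur == NULL)
		q->last = bp;
}

/* Head removal: last_offset only moves when no barrier is active. */
static struct sketch_bio *
sketch_takefirst(struct sketch_bioq *q)
{
	struct sketch_bio *bp = q->first;

	if (bp == NULL)
		return (NULL);
	if (q->insert_point == NULL)
		q->last_offset = bp->bio_offset + bp->bio_length;
	else if (bp == q->insert_point)
		q->insert_point = NULL;
	q->first = bp->next;
	if (q->first == NULL)
		q->last = NULL;
	bp->next = NULL;
	return (bp);
}

/* Queue three writes, an ordered flush, then two more writes; drain. */
void
sketch_demo(uint64_t out[6])
{
	struct sketch_bioq q = { NULL, NULL, NULL, 0 };
	struct sketch_bio b[6] = {
		{ 300, 8, 0, NULL }, { 100, 8, 0, NULL }, { 200, 8, 0, NULL },
		{ 1000, 0, BIO_ORDERED, NULL },
		{ 50, 8, 0, NULL }, { 1500, 8, 0, NULL },
	};
	struct sketch_bio *bp;
	size_t i;

	for (i = 0; i < 6; i++)
		sketch_disksort(&q, &b[i]);
	for (i = 0; (bp = sketch_takefirst(&q)) != NULL; i++)
		out[i] = bp->bio_offset;
}
```

In this model the bio at offset 50 is queued after the flush, so it stays
behind the barrier even though its offset is lower than everything queued
before it.  The two bios queued after the flush are sorted relative to the
barrier's offset, which is the effect of pinning last_offset that the log
describes.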