Date:      Fri, 24 May 2013 01:04:06 +0100
From:      "Steven Hartland" <killing@multiplay.co.uk>
To:        "Steven Hartland" <killing@multiplay.co.uk>, "Ajit Jain" <ajit.jain@cloudbyte.com>
Cc:        freebsd-fs <freebsd-fs@freebsd.org>
Subject:   Re: seeing data corruption with zfs trim functionality
Message-ID:  <9219DF6C2DFD422998FF9B5526A06EF1@multiplay.co.uk>
References:  <CAA71u6Y5dKZ9O0rqxCpx-9t7DYgTnPZSoNy-iHOnmzrOUYp%2Bvw@mail.gmail.com> <60316751643743738AB83DABC6A5934B@multiplay.co.uk> <20130429105143.GA1492@icarus.home.lan> <3AD1AB31003D49B2BF2EA7DD411B38A2@multiplay.co.uk> <C6AA4D0A7C49469ABB3C7440B1BCC108@multiplay.co.uk> <CAA71u6Zh7BbbdC=utqfR2MD1Nn=9euUDXHKqqu9NyBG-Jx%2B=Ow@mail.gmail.com> <9681E07546D348168052D4FC5365B4CD@multiplay.co.uk> <CAA71u6ZuO9CF0ECFS4z07-E5qPea-6SfNwkvhr_g6pFT5MV5yQ@mail.gmail.com> <CAA71u6YKGHDRVg6W_xnCNaA68bJvAZ2Lkp-UisiPqb1vKjJhfA@mail.gmail.com> <3E9CA9334E6F433A8F135ACD5C237340@multiplay.co.uk> <CAA71u6YZAKrmfTLU32f8UmYecmydwiqRT-OrR1ukZ9V6PGsU%2Bw@mail.gmail.com> <A05ACD84EB974E80B7142CE9982E479C@multiplay.co.uk> <93D0677B373A452BAF58C8EA6823783D@multiplay.co.uk> <CAA71u6bZ_4fb9FxYSwcrHBBApkZog30iQJGyTERi-xFMksud1g@mail.gmail.com> <35ABA7AAEB7F4D86A1ED54C4C47FEB49@multiplay.co.uk>

This is a multi-part message in MIME format.

------=_NextPart_000_0391_01CE581A.96C6FBA0
Content-Type: text/plain;
	format=flowed;
	charset="iso-8859-1";
	reply-type=original
Content-Transfer-Encoding: 7bit

Updated trim patch attached, this time created with --show-copies-as-adds
so you get all the files.

/me shakes his fist at svn!!

Thanks to Michael Moll for the push in the right direction on that :)

    Regards
    Steve

----- Original Message ----- 
From: "Steven Hartland" 

> I've attached the two patch sets I'm looking to MFC to stable-9: one
> adds the BIO_DELETE CAM changes and the other adds ZFS TRIM support.
> 
> They should both apply cleanly to stable-9. Could you test with
> those on your machine and let me know how it goes?

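For anyone reading the ZFS TRIM patch, here is a rough standalone sketch of the end-offset rounding that TRIM_ZIO_END in trim_map.c performs. The helper names (roundup_pow2, trim_zio_end) and the bare ashift parameter are stand-ins for illustration only; the patch itself uses P2ROUNDUP and vd->vdev_top->vdev_ashift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative stand-in for the kernel's P2ROUNDUP(): round x up to the
 * next multiple of align, which must be a power of two.
 */
static uint64_t
roundup_pow2(uint64_t x, uint64_t align)
{
	return ((x + align - 1) & ~(align - 1));
}

/*
 * Hypothetical userland equivalent of TRIM_ZIO_END(vd, offset, size),
 * taking the top-level vdev's ashift directly instead of a vdev_t.
 */
static uint64_t
trim_zio_end(uint64_t ashift, uint64_t offset, uint64_t size)
{
	return (offset + roundup_pow2(size, 1ULL << ashift));
}
```

With ashift=12, a 512-byte free at offset 8192 would be recorded as the range [8192, 12288), so frees that touch the same 4K block line up on block boundaries and can be merged.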

------=_NextPart_000_0391_01CE581A.96C6FBA0
Content-Type: application/octet-stream;
	name="mfc-zfs-misc-stable-9-v2.patch"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="mfc-zfs-misc-stable-9-v2.patch"

Index: cddl/lib=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- cddl/lib	(revision 250526)=0A=
+++ cddl/lib	(working copy)=0A=
=0A=
Property changes on: cddl/lib=0A=
___________________________________________________________________=0A=
Modified: svn:mergeinfo=0A=
   Merged /head/cddl/lib:r240868=0A=
Index: cddl/lib/libzpool/Makefile=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- cddl/lib/libzpool/Makefile	(revision 250526)=0A=
+++ cddl/lib/libzpool/Makefile	(working copy)=0A=
@@ -26,7 +26,7 @@=0A=
 =0A=
 LIB=3D		zpool=0A=
 =0A=
-ZFS_COMMON_SRCS=3D ${ZFS_COMMON_OBJS:C/.o$/.c/} vdev_file.c=0A=
+ZFS_COMMON_SRCS=3D ${ZFS_COMMON_OBJS:C/.o$/.c/} vdev_file.c trim_map.c=0A=
 ZFS_SHARED_SRCS=3D ${ZFS_SHARED_OBJS:C/.o$/.c/}=0A=
 KERNEL_SRCS=3D	kernel.c taskq.c util.c=0A=
 LIST_SRCS=3D	list.c=0A=
Index: UPDATING=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- UPDATING	(revision 250526)=0A=
+++ UPDATING	(working copy)=0A=
@@ -11,6 +11,22 @@=0A=
 Items affecting the ports and packages system can be found in=0A=
 /usr/ports/UPDATING.  Please read that file before running portupgrade.=0A=
 =0A=
+20130512:=0A=
+	Added ZFS TRIM support which is enabled by default. To disable=0A=
+	ZFS TRIM support set vfs.zfs.trim.enabled=3D0 in loader.conf.=0A=
+=0A=
+	Creating new ZFS pools and adding new devices to existing pools=0A=
+	first performs a full device level TRIM, which can take a significant=0A=
+	amount of time. Set the sysctl vfs.zfs.vdev.trim_on_init to 0 to=0A=
+	disable this behaviour.=0A=
+=0A=
+	ZFS TRIM requires the underlying device support BIO_DELETE which=0A=
+	is currently provided by methods such as ATA TRIM and SCSI UNMAP=0A=
+	via CAM, which are typically supported by SSDs.=0A=
+=0A=
+	Stats for ZFS TRIM can be monitored by looking at the sysctls=0A=
+	under kstat.zfs.misc.zio_trim.=0A=
+=0A=
 20130430:=0A=
 	The mergemaster command now uses the default MAKEOBJDIRPREFIX=0A=
 	rather than creating it's own in the temporary directory in=0A=
Index: sys=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys	(revision 250526)=0A=
+++ sys	(working copy)=0A=
=0A=
Property changes on: sys=0A=
___________________________________________________________________=0A=
Modified: svn:mergeinfo=0A=
   Merged =
/head/sys:r240868,244155,244187-244188,248572,248574-248577,248602,249921=0A=
Index: sys/modules=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/modules	(revision 250526)=0A=
+++ sys/modules	(working copy)=0A=
=0A=
Property changes on: sys/modules=0A=
___________________________________________________________________=0A=
Modified: svn:mergeinfo=0A=
   Merged /head/sys/modules:r240868=0A=
Index: sys/modules/zfs/Makefile=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/modules/zfs/Makefile	(revision 250526)=0A=
+++ sys/modules/zfs/Makefile	(working copy)=0A=
@@ -72,6 +72,7 @@=0A=
 ZFS_SRCS=3D	${ZFS_OBJS:C/.o$/.c/}=0A=
 SRCS+=3D	${ZFS_SRCS}=0A=
 SRCS+=3D	vdev_geom.c=0A=
+SRCS+=3D	trim_map.c=0A=
 =0A=
 # Use FreeBSD's namecache.=0A=
 CFLAGS+=3D-DFREEBSD_NAMECACHE=0A=
Index: sys/cddl/contrib/opensolaris=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris	(revision 250526)=0A=
+++ sys/cddl/contrib/opensolaris	(working copy)=0A=
=0A=
Property changes on: sys/cddl/contrib/opensolaris=0A=
___________________________________________________________________=0A=
Modified: svn:mergeinfo=0A=
   Merged =
/head/sys/cddl/contrib/opensolaris:r240868,244155,244187-244188,248572,24=
8574-248577,248602,249921=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c	=
(revision 250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c	=
(working copy)=0A=
@@ -293,10 +293,11 @@=0A=
 		c =3D vdev_mirror_child_select(zio);=0A=
 		children =3D (c >=3D 0);=0A=
 	} else {=0A=
-		ASSERT(zio->io_type =3D=3D ZIO_TYPE_WRITE);=0A=
+		ASSERT(zio->io_type =3D=3D ZIO_TYPE_WRITE ||=0A=
+		    zio->io_type =3D=3D ZIO_TYPE_FREE);=0A=
 =0A=
 		/*=0A=
-		 * Writes go to all children.=0A=
+		 * Writes and frees go to all children.=0A=
 		 */=0A=
 		c =3D 0;=0A=
 		children =3D mm->mm_children;=0A=
@@ -377,6 +378,8 @@=0A=
 				zio->io_error =3D vdev_mirror_worst_error(mm);=0A=
 		}=0A=
 		return;=0A=
+	} else if (zio->io_type =3D=3D ZIO_TYPE_FREE) {=0A=
+		return;=0A=
 	}=0A=
 =0A=
 	ASSERT(zio->io_type =3D=3D ZIO_TYPE_READ);=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c	(revision =
250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c	(working copy)=0A=
@@ -83,6 +83,11 @@=0A=
 TUNABLE_INT("vfs.zfs.cache_flush_disable", &zfs_nocacheflush);=0A=
 SYSCTL_INT(_vfs_zfs, OID_AUTO, cache_flush_disable, CTLFLAG_RDTUN,=0A=
     &zfs_nocacheflush, 0, "Disable cache flush");=0A=
+boolean_t zfs_trim_enabled =3D B_TRUE;=0A=
+SYSCTL_DECL(_vfs_zfs_trim);=0A=
+TUNABLE_INT("vfs.zfs.trim.enabled", &zfs_trim_enabled);=0A=
+SYSCTL_INT(_vfs_zfs_trim, OID_AUTO, enabled, CTLFLAG_RDTUN, =
&zfs_trim_enabled, 0,=0A=
+    "Enable ZFS TRIM");=0A=
 =0A=
 static kmem_cache_t *zil_lwb_cache;=0A=
 =0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/trim_map.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/trim_map.c	(revision =
0)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/trim_map.c	(working =
copy)=0A=
@@ -0,0 +1,638 @@=0A=
+/*=0A=
+ * CDDL HEADER START=0A=
+ *=0A=
+ * The contents of this file are subject to the terms of the=0A=
+ * Common Development and Distribution License (the "License").=0A=
+ * You may not use this file except in compliance with the License.=0A=
+ *=0A=
+ * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE=0A=
+ * or http://www.opensolaris.org/os/licensing.=0A=
+ * See the License for the specific language governing permissions=0A=
+ * and limitations under the License.=0A=
+ *=0A=
+ * When distributing Covered Code, include this CDDL HEADER in each=0A=
+ * file and include the License file at usr/src/OPENSOLARIS.LICENSE.=0A=
+ * If applicable, add the following below this CDDL HEADER, with the=0A=
+ * fields enclosed by brackets "[]" replaced with your own identifying=0A=
+ * information: Portions Copyright [yyyy] [name of copyright owner]=0A=
+ *=0A=
+ * CDDL HEADER END=0A=
+ */=0A=
+/*=0A=
+ * Copyright (c) 2012 Pawel Jakub Dawidek <pawel@dawidek.net>.=0A=
+ * All rights reserved.=0A=
+ */=0A=
+=0A=
+#include <sys/zfs_context.h>=0A=
+#include <sys/spa_impl.h>=0A=
+#include <sys/vdev_impl.h>=0A=
+#include <sys/trim_map.h>=0A=
+#include <sys/time.h>=0A=
+=0A=
+/*=0A=
+ * Calculate the zio end, rounding the size up to the vdev ashift as=0A=
+ * is done by zio_vdev_io_start.=0A=
+ *=0A=
+ * This makes free range consolidation much more effective=0A=
+ * than it would otherwise be as well as ensuring that entire=0A=
+ * blocks are invalidated by writes.=0A=
+ */=0A=
+#define	TRIM_ZIO_END(vd, offset, size)	(offset +		\=0A=
+ 	P2ROUNDUP(size, 1ULL << vd->vdev_top->vdev_ashift))=0A=
+=0A=
+#define TRIM_MAP_SINC(tm, size)					\=0A=
+	atomic_add_64(&(tm)->tm_bytes, (size))=0A=
+=0A=
+#define TRIM_MAP_SDEC(tm, size)					\=0A=
+	atomic_add_64(&(tm)->tm_bytes, -(size))=0A=
+=0A=
+#define TRIM_MAP_QINC(tm)					\=0A=
+	atomic_inc_64(&(tm)->tm_pending)=0A=
+=0A=
+#define TRIM_MAP_QDEC(tm)					\=0A=
+	atomic_dec_64(&(tm)->tm_pending)=0A=
+=0A=
+typedef struct trim_map {=0A=
+	list_t		tm_head;		/* List of segments sorted by txg. */=0A=
+	avl_tree_t	tm_queued_frees;	/* AVL tree of segments waiting for TRIM. =
*/=0A=
+	avl_tree_t	tm_inflight_frees;	/* AVL tree of in-flight TRIMs. */=0A=
+	avl_tree_t	tm_inflight_writes;	/* AVL tree of in-flight writes. */=0A=
+	list_t		tm_pending_writes;	/* Writes blocked on in-flight frees. */=0A=
+	kmutex_t	tm_lock;=0A=
+	uint64_t	tm_pending;		/* Count of pending TRIMs. */=0A=
+	uint64_t	tm_bytes;		/* Total size in bytes of queued TRIMs. */=0A=
+} trim_map_t;=0A=
+=0A=
+typedef struct trim_seg {=0A=
+	avl_node_t	ts_node;	/* AVL node. */=0A=
+	list_node_t	ts_next;	/* List element. */=0A=
+	uint64_t	ts_start;	/* Starting offset of this segment. */=0A=
+	uint64_t	ts_end;		/* Ending offset (non-inclusive). */=0A=
+	uint64_t	ts_txg;		/* Segment creation txg. */=0A=
+	hrtime_t	ts_time;	/* Segment creation time. */=0A=
+} trim_seg_t;=0A=
+=0A=
+extern boolean_t zfs_trim_enabled;=0A=
+=0A=
+static u_int trim_txg_delay =3D 32;=0A=
+static u_int trim_timeout =3D 30;=0A=
+static u_int trim_max_interval =3D 1;=0A=
+/* Limit outstanding TRIMs to 2G (max size for a single TRIM request) */=0A=
+static uint64_t trim_vdev_max_bytes =3D 2147483648;=0A=
+/* Limit outstanding TRIMs to 64 (max ranges for a single TRIM request) =
*/	=0A=
+static u_int trim_vdev_max_pending =3D 64;=0A=
+=0A=
+SYSCTL_DECL(_vfs_zfs);=0A=
+SYSCTL_NODE(_vfs_zfs, OID_AUTO, trim, CTLFLAG_RD, 0, "ZFS TRIM");=0A=
+=0A=
+TUNABLE_INT("vfs.zfs.trim.txg_delay", &trim_txg_delay);=0A=
+SYSCTL_UINT(_vfs_zfs_trim, OID_AUTO, txg_delay, CTLFLAG_RWTUN, =
&trim_txg_delay,=0A=
+    0, "Delay TRIMs by up to this many TXGs");=0A=
+=0A=
+TUNABLE_INT("vfs.zfs.trim.timeout", &trim_timeout);=0A=
+SYSCTL_UINT(_vfs_zfs_trim, OID_AUTO, timeout, CTLFLAG_RWTUN, =
&trim_timeout, 0,=0A=
+    "Delay TRIMs by up to this many seconds");=0A=
+=0A=
+TUNABLE_INT("vfs.zfs.trim.max_interval", &trim_max_interval);=0A=
+SYSCTL_UINT(_vfs_zfs_trim, OID_AUTO, max_interval, CTLFLAG_RWTUN,=0A=
+    &trim_max_interval, 0,=0A=
+    "Maximum interval between TRIM queue processing (seconds)");=0A=
+=0A=
+SYSCTL_DECL(_vfs_zfs_vdev);=0A=
+TUNABLE_QUAD("vfs.zfs.vdev.trim_max_bytes", &trim_vdev_max_bytes);=0A=
+SYSCTL_QUAD(_vfs_zfs_vdev, OID_AUTO, trim_max_bytes, CTLFLAG_RWTUN,=0A=
+    &trim_vdev_max_bytes, 0,=0A=
+    "Maximum pending TRIM bytes for a vdev");=0A=
+=0A=
+TUNABLE_INT("vfs.zfs.vdev.trim_max_pending", &trim_vdev_max_pending);=0A=
+SYSCTL_UINT(_vfs_zfs_vdev, OID_AUTO, trim_max_pending, CTLFLAG_RWTUN,=0A=
+    &trim_vdev_max_pending, 0,=0A=
+    "Maximum pending TRIM segments for a vdev");=0A=
+=0A=
+=0A=
+static void trim_map_vdev_commit_done(spa_t *spa, vdev_t *vd);=0A=
+=0A=
+static int=0A=
+trim_map_seg_compare(const void *x1, const void *x2)=0A=
+{=0A=
+	const trim_seg_t *s1 =3D x1;=0A=
+	const trim_seg_t *s2 =3D x2;=0A=
+=0A=
+	if (s1->ts_start < s2->ts_start) {=0A=
+		if (s1->ts_end > s2->ts_start)=0A=
+			return (0);=0A=
+		return (-1);=0A=
+	}=0A=
+	if (s1->ts_start > s2->ts_start) {=0A=
+		if (s1->ts_start < s2->ts_end)=0A=
+			return (0);=0A=
+		return (1);=0A=
+	}=0A=
+	return (0);=0A=
+}=0A=
+=0A=
+static int=0A=
+trim_map_zio_compare(const void *x1, const void *x2)=0A=
+{=0A=
+	const zio_t *z1 =3D x1;=0A=
+	const zio_t *z2 =3D x2;=0A=
+=0A=
+	if (z1->io_offset < z2->io_offset) {=0A=
+		if (z1->io_offset + z1->io_size > z2->io_offset)=0A=
+			return (0);=0A=
+		return (-1);=0A=
+	}=0A=
+	if (z1->io_offset > z2->io_offset) {=0A=
+		if (z1->io_offset < z2->io_offset + z2->io_size)=0A=
+			return (0);=0A=
+		return (1);=0A=
+	}=0A=
+	return (0);=0A=
+}=0A=
+=0A=
+void=0A=
+trim_map_create(vdev_t *vd)=0A=
+{=0A=
+	trim_map_t *tm;=0A=
+=0A=
+	ASSERT(vd->vdev_ops->vdev_op_leaf);=0A=
+=0A=
+	if (!zfs_trim_enabled)=0A=
+		return;=0A=
+=0A=
+	tm =3D kmem_zalloc(sizeof (*tm), KM_SLEEP);=0A=
+	mutex_init(&tm->tm_lock, NULL, MUTEX_DEFAULT, NULL);=0A=
+	list_create(&tm->tm_head, sizeof (trim_seg_t),=0A=
+	    offsetof(trim_seg_t, ts_next));=0A=
+	list_create(&tm->tm_pending_writes, sizeof (zio_t),=0A=
+	    offsetof(zio_t, io_trim_link));=0A=
+	avl_create(&tm->tm_queued_frees, trim_map_seg_compare,=0A=
+	    sizeof (trim_seg_t), offsetof(trim_seg_t, ts_node));=0A=
+	avl_create(&tm->tm_inflight_frees, trim_map_seg_compare,=0A=
+	    sizeof (trim_seg_t), offsetof(trim_seg_t, ts_node));=0A=
+	avl_create(&tm->tm_inflight_writes, trim_map_zio_compare,=0A=
+	    sizeof (zio_t), offsetof(zio_t, io_trim_node));=0A=
+	vd->vdev_trimmap =3D tm;=0A=
+}=0A=
+=0A=
+void=0A=
+trim_map_destroy(vdev_t *vd)=0A=
+{=0A=
+	trim_map_t *tm;=0A=
+	trim_seg_t *ts;=0A=
+=0A=
+	ASSERT(vd->vdev_ops->vdev_op_leaf);=0A=
+=0A=
+	if (!zfs_trim_enabled)=0A=
+		return;=0A=
+=0A=
+	tm =3D vd->vdev_trimmap;=0A=
+	if (tm =3D=3D NULL)=0A=
+		return;=0A=
+=0A=
+	/*=0A=
+	 * We may have been called before trim_map_vdev_commit_done()=0A=
+	 * had a chance to run, so do it now to prune the remaining=0A=
+	 * inflight frees.=0A=
+	 */=0A=
+	trim_map_vdev_commit_done(vd->vdev_spa, vd);=0A=
+=0A=
+	mutex_enter(&tm->tm_lock);=0A=
+	while ((ts =3D list_head(&tm->tm_head)) !=3D NULL) {=0A=
+		avl_remove(&tm->tm_queued_frees, ts);=0A=
+		list_remove(&tm->tm_head, ts);=0A=
+		TRIM_MAP_SDEC(tm, ts->ts_end - ts->ts_start);=0A=
+		TRIM_MAP_QDEC(tm);=0A=
+		kmem_free(ts, sizeof (*ts));=0A=
+	}=0A=
+	mutex_exit(&tm->tm_lock);=0A=
+=0A=
+	avl_destroy(&tm->tm_queued_frees);=0A=
+	avl_destroy(&tm->tm_inflight_frees);=0A=
+	avl_destroy(&tm->tm_inflight_writes);=0A=
+	list_destroy(&tm->tm_pending_writes);=0A=
+	list_destroy(&tm->tm_head);=0A=
+	mutex_destroy(&tm->tm_lock);=0A=
+	kmem_free(tm, sizeof (*tm));=0A=
+	vd->vdev_trimmap =3D NULL;=0A=
+}=0A=
+=0A=
+static void=0A=
+trim_map_segment_add(trim_map_t *tm, uint64_t start, uint64_t end, =
uint64_t txg)=0A=
+{=0A=
+	avl_index_t where;=0A=
+	trim_seg_t tsearch, *ts_before, *ts_after, *ts;=0A=
+	boolean_t merge_before, merge_after;=0A=
+	hrtime_t time;=0A=
+=0A=
+	ASSERT(MUTEX_HELD(&tm->tm_lock));=0A=
+	VERIFY(start < end);=0A=
+=0A=
+	time =3D gethrtime();=0A=
+	tsearch.ts_start =3D start;=0A=
+	tsearch.ts_end =3D end;=0A=
+=0A=
+	ts =3D avl_find(&tm->tm_queued_frees, &tsearch, &where);=0A=
+	if (ts !=3D NULL) {=0A=
+		if (start < ts->ts_start)=0A=
+			trim_map_segment_add(tm, start, ts->ts_start, txg);=0A=
+		if (end > ts->ts_end)=0A=
+			trim_map_segment_add(tm, ts->ts_end, end, txg);=0A=
+		return;=0A=
+	}=0A=
+=0A=
+	ts_before =3D avl_nearest(&tm->tm_queued_frees, where, AVL_BEFORE);=0A=
+	ts_after =3D avl_nearest(&tm->tm_queued_frees, where, AVL_AFTER);=0A=
+=0A=
+	merge_before =3D (ts_before !=3D NULL && ts_before->ts_end =3D=3D =
start);=0A=
+	merge_after =3D (ts_after !=3D NULL && ts_after->ts_start =3D=3D end);=0A=
+=0A=
+	if (merge_before && merge_after) {=0A=
+		TRIM_MAP_SINC(tm, ts_after->ts_start - ts_before->ts_end);=0A=
+		TRIM_MAP_QDEC(tm);=0A=
+		avl_remove(&tm->tm_queued_frees, ts_before);=0A=
+		list_remove(&tm->tm_head, ts_before);=0A=
+		ts_after->ts_start =3D ts_before->ts_start;=0A=
+		ts_after->ts_txg =3D txg;=0A=
+		ts_after->ts_time =3D time;=0A=
+		kmem_free(ts_before, sizeof (*ts_before));=0A=
+	} else if (merge_before) {=0A=
+		TRIM_MAP_SINC(tm, end - ts_before->ts_end);=0A=
+		ts_before->ts_end =3D end;=0A=
+		ts_before->ts_txg =3D txg;=0A=
+		ts_before->ts_time =3D time;=0A=
+	} else if (merge_after) {=0A=
+		TRIM_MAP_SINC(tm, ts_after->ts_start - start);=0A=
+		ts_after->ts_start =3D start;=0A=
+		ts_after->ts_txg =3D txg;=0A=
+		ts_after->ts_time =3D time;=0A=
+	} else {=0A=
+		TRIM_MAP_SINC(tm, end - start);=0A=
+		TRIM_MAP_QINC(tm);=0A=
+		ts =3D kmem_alloc(sizeof (*ts), KM_SLEEP);=0A=
+		ts->ts_start =3D start;=0A=
+		ts->ts_end =3D end;=0A=
+		ts->ts_txg =3D txg;=0A=
+		ts->ts_time =3D time;=0A=
+		avl_insert(&tm->tm_queued_frees, ts, where);=0A=
+		list_insert_tail(&tm->tm_head, ts);=0A=
+	}=0A=
+}=0A=
+=0A=
+static void=0A=
+trim_map_segment_remove(trim_map_t *tm, trim_seg_t *ts, uint64_t start,=0A=
+    uint64_t end)=0A=
+{=0A=
+	trim_seg_t *nts;=0A=
+	boolean_t left_over, right_over;=0A=
+=0A=
+	ASSERT(MUTEX_HELD(&tm->tm_lock));=0A=
+=0A=
+	left_over =3D (ts->ts_start < start);=0A=
+	right_over =3D (ts->ts_end > end);=0A=
+=0A=
+	TRIM_MAP_SDEC(tm, end - start);=0A=
+	if (left_over && right_over) {=0A=
+		nts =3D kmem_alloc(sizeof (*nts), KM_SLEEP);=0A=
+		nts->ts_start =3D end;=0A=
+		nts->ts_end =3D ts->ts_end;=0A=
+		nts->ts_txg =3D ts->ts_txg;=0A=
+		nts->ts_time =3D ts->ts_time;=0A=
+		ts->ts_end =3D start;=0A=
+		avl_insert_here(&tm->tm_queued_frees, nts, ts, AVL_AFTER);=0A=
+		list_insert_after(&tm->tm_head, ts, nts);=0A=
+		TRIM_MAP_QINC(tm);=0A=
+	} else if (left_over) {=0A=
+		ts->ts_end =3D start;=0A=
+	} else if (right_over) {=0A=
+		ts->ts_start =3D end;=0A=
+	} else {=0A=
+		avl_remove(&tm->tm_queued_frees, ts);=0A=
+		list_remove(&tm->tm_head, ts);=0A=
+		TRIM_MAP_QDEC(tm);=0A=
+		kmem_free(ts, sizeof (*ts));=0A=
+	}=0A=
+}=0A=
+=0A=
+static void=0A=
+trim_map_free_locked(trim_map_t *tm, uint64_t start, uint64_t end, =
uint64_t txg)=0A=
+{=0A=
+	zio_t zsearch, *zs;=0A=
+=0A=
+	ASSERT(MUTEX_HELD(&tm->tm_lock));=0A=
+=0A=
+	zsearch.io_offset =3D start;=0A=
+	zsearch.io_size =3D end - start;=0A=
+=0A=
+	zs =3D avl_find(&tm->tm_inflight_writes, &zsearch, NULL);=0A=
+	if (zs =3D=3D NULL) {=0A=
+		trim_map_segment_add(tm, start, end, txg);=0A=
+		return;=0A=
+	}=0A=
+	if (start < zs->io_offset)=0A=
+		trim_map_free_locked(tm, start, zs->io_offset, txg);=0A=
+	if (zs->io_offset + zs->io_size < end)=0A=
+		trim_map_free_locked(tm, zs->io_offset + zs->io_size, end, txg);=0A=
+}=0A=
+=0A=
+void=0A=
+trim_map_free(vdev_t *vd, uint64_t offset, uint64_t size, uint64_t txg)=0A=
+{=0A=
+	trim_map_t *tm =3D vd->vdev_trimmap;=0A=
+=0A=
+	if (!zfs_trim_enabled || vd->vdev_notrim || tm =3D=3D NULL)=0A=
+		return;=0A=
+=0A=
+	mutex_enter(&tm->tm_lock);=0A=
+	trim_map_free_locked(tm, offset, TRIM_ZIO_END(vd, offset, size), txg);=0A=
+	mutex_exit(&tm->tm_lock);=0A=
+}=0A=
+=0A=
+boolean_t=0A=
+trim_map_write_start(zio_t *zio)=0A=
+{=0A=
+	vdev_t *vd =3D zio->io_vd;=0A=
+	trim_map_t *tm =3D vd->vdev_trimmap;=0A=
+	trim_seg_t tsearch, *ts;=0A=
+	boolean_t left_over, right_over;=0A=
+	uint64_t start, end;=0A=
+=0A=
+	if (!zfs_trim_enabled || vd->vdev_notrim || tm =3D=3D NULL)=0A=
+		return (B_TRUE);=0A=
+=0A=
+	start =3D zio->io_offset;=0A=
+	end =3D TRIM_ZIO_END(zio->io_vd, start, zio->io_size);=0A=
+	tsearch.ts_start =3D start;=0A=
+	tsearch.ts_end =3D end;=0A=
+=0A=
+	mutex_enter(&tm->tm_lock);=0A=
+=0A=
+	/*=0A=
+	 * Checking for colliding in-flight frees.=0A=
+	 */=0A=
+	ts =3D avl_find(&tm->tm_inflight_frees, &tsearch, NULL);=0A=
+	if (ts !=3D NULL) {=0A=
+		list_insert_tail(&tm->tm_pending_writes, zio);=0A=
+		mutex_exit(&tm->tm_lock);=0A=
+		return (B_FALSE);=0A=
+	}=0A=
+=0A=
+	ts =3D avl_find(&tm->tm_queued_frees, &tsearch, NULL);=0A=
+	if (ts !=3D NULL) {=0A=
+		/*=0A=
+		 * Loop until all overlapping segments are removed.=0A=
+		 */=0A=
+		do {=0A=
+			trim_map_segment_remove(tm, ts, start, end);=0A=
+			ts =3D avl_find(&tm->tm_queued_frees, &tsearch, NULL);=0A=
+		} while (ts !=3D NULL);=0A=
+	}=0A=
+	avl_add(&tm->tm_inflight_writes, zio);=0A=
+=0A=
+	mutex_exit(&tm->tm_lock);=0A=
+=0A=
+	return (B_TRUE);=0A=
+}=0A=
+=0A=
+void=0A=
+trim_map_write_done(zio_t *zio)=0A=
+{=0A=
+	vdev_t *vd =3D zio->io_vd;=0A=
+	trim_map_t *tm =3D vd->vdev_trimmap;=0A=
+=0A=
+	/*=0A=
+	 * Don't check for vdev_notrim, since the write could have=0A=
+	 * started before vdev_notrim was set.=0A=
+	 */=0A=
+	if (!zfs_trim_enabled || tm =3D=3D NULL)=0A=
+		return;=0A=
+=0A=
+	mutex_enter(&tm->tm_lock);=0A=
+	/*=0A=
+	 * Don't fail if the write isn't in the tree, since the write=0A=
+	 * could have started after vdev_notrim was set.=0A=
+	 */=0A=
+	if (zio->io_trim_node.avl_child[0] ||=0A=
+	    zio->io_trim_node.avl_child[1] ||=0A=
+	    AVL_XPARENT(&zio->io_trim_node) ||=0A=
+	    tm->tm_inflight_writes.avl_root =3D=3D &zio->io_trim_node)=0A=
+		avl_remove(&tm->tm_inflight_writes, zio);=0A=
+	mutex_exit(&tm->tm_lock);=0A=
+}=0A=
+=0A=
+/*=0A=
+ * Return the oldest segment (the one with the lowest txg / time) or =
NULL if:=0A=
+ * 1. The list is empty=0A=
+ * 2. The first element's txg is greater than txgsafe=0A=
+ * 3. The first element's txg and time are both greater than the txg and =
time=0A=
+ *    arguments, and neither per-vdev TRIM limit is exceeded=0A=
+ */=0A=
+static trim_seg_t *=0A=
+trim_map_first(trim_map_t *tm, uint64_t txg, uint64_t txgsafe, hrtime_t =
time)=0A=
+{=0A=
+	trim_seg_t *ts;=0A=
+=0A=
+	ASSERT(MUTEX_HELD(&tm->tm_lock));=0A=
+	VERIFY(txgsafe >=3D txg);=0A=
+=0A=
+	ts =3D list_head(&tm->tm_head);=0A=
+	if (ts !=3D NULL && ts->ts_txg <=3D txgsafe &&=0A=
+	    (ts->ts_txg <=3D txg || ts->ts_time <=3D time ||=0A=
+	    tm->tm_bytes > trim_vdev_max_bytes ||=0A=
+	    tm->tm_pending > trim_vdev_max_pending))=0A=
+		return (ts);=0A=
+	return (NULL);=0A=
+}=0A=
+=0A=
+static void=0A=
+trim_map_vdev_commit(spa_t *spa, zio_t *zio, vdev_t *vd)=0A=
+{=0A=
+	trim_map_t *tm =3D vd->vdev_trimmap;=0A=
+	trim_seg_t *ts;=0A=
+	uint64_t size, txgtarget, txgsafe;=0A=
+	hrtime_t timelimit;=0A=
+=0A=
+	ASSERT(vd->vdev_ops->vdev_op_leaf);=0A=
+=0A=
+	if (tm =3D=3D NULL)=0A=
+		return;=0A=
+=0A=
+	timelimit =3D gethrtime() - (hrtime_t)trim_timeout * NANOSEC;=0A=
+	if (vd->vdev_isl2cache) {=0A=
+		txgsafe =3D UINT64_MAX;=0A=
+		txgtarget =3D UINT64_MAX;=0A=
+	} else {=0A=
+		txgsafe =3D MIN(spa_last_synced_txg(spa), spa_freeze_txg(spa));=0A=
+		if (txgsafe > trim_txg_delay)=0A=
+			txgtarget =3D txgsafe - trim_txg_delay;=0A=
+		else=0A=
+			txgtarget =3D 0;=0A=
+	}=0A=
+=0A=
+	mutex_enter(&tm->tm_lock);=0A=
+	/* Loop until we have sent all outstanding frees */=0A=
+	while ((ts =3D trim_map_first(tm, txgtarget, txgsafe, timelimit))=0A=
+	    !=3D NULL) {=0A=
+		list_remove(&tm->tm_head, ts);=0A=
+		avl_remove(&tm->tm_queued_frees, ts);=0A=
+		avl_add(&tm->tm_inflight_frees, ts);=0A=
+		size =3D ts->ts_end - ts->ts_start;=0A=
+		zio_nowait(zio_trim(zio, spa, vd, ts->ts_start, size));=0A=
+		TRIM_MAP_SDEC(tm, size);=0A=
+		TRIM_MAP_QDEC(tm);=0A=
+	}=0A=
+	mutex_exit(&tm->tm_lock);=0A=
+}=0A=
+=0A=
+static void=0A=
+trim_map_vdev_commit_done(spa_t *spa, vdev_t *vd)=0A=
+{=0A=
+	trim_map_t *tm =3D vd->vdev_trimmap;=0A=
+	trim_seg_t *ts;=0A=
+	list_t pending_writes;=0A=
+	zio_t *zio;=0A=
+	uint64_t start, size;=0A=
+	void *cookie;=0A=
+=0A=
+	ASSERT(vd->vdev_ops->vdev_op_leaf);=0A=
+=0A=
+	if (tm =3D=3D NULL)=0A=
+		return;=0A=
+=0A=
+	mutex_enter(&tm->tm_lock);=0A=
+	if (!avl_is_empty(&tm->tm_inflight_frees)) {=0A=
+		cookie =3D NULL;=0A=
+		while ((ts =3D avl_destroy_nodes(&tm->tm_inflight_frees,=0A=
+		    &cookie)) !=3D NULL) {=0A=
+			kmem_free(ts, sizeof (*ts));=0A=
+		}=0A=
+	}=0A=
+	list_create(&pending_writes, sizeof (zio_t), offsetof(zio_t,=0A=
+	    io_trim_link));=0A=
+	list_move_tail(&pending_writes, &tm->tm_pending_writes);=0A=
+	mutex_exit(&tm->tm_lock);=0A=
+=0A=
+	while ((zio =3D list_remove_head(&pending_writes)) !=3D NULL) {=0A=
+		zio_vdev_io_reissue(zio);=0A=
+		zio_execute(zio);=0A=
+	}=0A=
+	list_destroy(&pending_writes);=0A=
+}=0A=
+=0A=
+static void=0A=
+trim_map_commit(spa_t *spa, zio_t *zio, vdev_t *vd)=0A=
+{=0A=
+	int c;=0A=
+=0A=
+	if (vd =3D=3D NULL)=0A=
+		return;=0A=
+=0A=
+	if (vd->vdev_ops->vdev_op_leaf) {=0A=
+		trim_map_vdev_commit(spa, zio, vd);=0A=
+	} else {=0A=
+		for (c =3D 0; c < vd->vdev_children; c++)=0A=
+			trim_map_commit(spa, zio, vd->vdev_child[c]);=0A=
+	}=0A=
+}=0A=
+=0A=
+static void=0A=
+trim_map_commit_done(spa_t *spa, vdev_t *vd)=0A=
+{=0A=
+	int c;=0A=
+=0A=
+	if (vd =3D=3D NULL)=0A=
+		return;=0A=
+=0A=
+	if (vd->vdev_ops->vdev_op_leaf) {=0A=
+		trim_map_vdev_commit_done(spa, vd);=0A=
+	} else {=0A=
+		for (c =3D 0; c < vd->vdev_children; c++)=0A=
+			trim_map_commit_done(spa, vd->vdev_child[c]);=0A=
+	}=0A=
+}=0A=
+=0A=
+static void=0A=
+trim_thread(void *arg)=0A=
+{=0A=
+	spa_t *spa =3D arg;=0A=
+	zio_t *zio;=0A=
+=0A=
+#ifdef _KERNEL=0A=
+	(void) snprintf(curthread->td_name, sizeof(curthread->td_name),=0A=
+	    "trim %s", spa_name(spa));=0A=
+#endif=0A=
+=0A=
+	for (;;) {=0A=
+		mutex_enter(&spa->spa_trim_lock);=0A=
+		if (spa->spa_trim_thread =3D=3D NULL) {=0A=
+			spa->spa_trim_thread =3D curthread;=0A=
+			cv_signal(&spa->spa_trim_cv);=0A=
+			mutex_exit(&spa->spa_trim_lock);=0A=
+			thread_exit();=0A=
+		}=0A=
+=0A=
+		(void) cv_timedwait(&spa->spa_trim_cv, &spa->spa_trim_lock,=0A=
+		    hz * trim_max_interval);=0A=
+		mutex_exit(&spa->spa_trim_lock);=0A=
+=0A=
+		zio =3D zio_root(spa, NULL, NULL, ZIO_FLAG_CANFAIL);=0A=
+=0A=
+		spa_config_enter(spa, SCL_STATE, FTAG, RW_READER);=0A=
+		trim_map_commit(spa, zio, spa->spa_root_vdev);=0A=
+		(void) zio_wait(zio);=0A=
+		trim_map_commit_done(spa, spa->spa_root_vdev);=0A=
+		spa_config_exit(spa, SCL_STATE, FTAG);=0A=
+	}=0A=
+}=0A=
+=0A=
+void=0A=
+trim_thread_create(spa_t *spa)=0A=
+{=0A=
+=0A=
+	if (!zfs_trim_enabled)=0A=
+		return;=0A=
+=0A=
+	mutex_init(&spa->spa_trim_lock, NULL, MUTEX_DEFAULT, NULL);=0A=
+	cv_init(&spa->spa_trim_cv, NULL, CV_DEFAULT, NULL);=0A=
+	mutex_enter(&spa->spa_trim_lock);=0A=
+	spa->spa_trim_thread =3D thread_create(NULL, 0, trim_thread, spa, 0, =
&p0,=0A=
+	    TS_RUN, minclsyspri);=0A=
+	mutex_exit(&spa->spa_trim_lock);=0A=
+}=0A=
+=0A=
+void=0A=
+trim_thread_destroy(spa_t *spa)=0A=
+{=0A=
+=0A=
+	if (!zfs_trim_enabled)=0A=
+		return;=0A=
+	if (spa->spa_trim_thread =3D=3D NULL)=0A=
+		return;=0A=
+=0A=
+	mutex_enter(&spa->spa_trim_lock);=0A=
+	/* Setting spa_trim_thread to NULL tells the thread to stop. */=0A=
+	spa->spa_trim_thread =3D NULL;=0A=
+	cv_signal(&spa->spa_trim_cv);=0A=
+	/* The thread will set it back to !=3D NULL on exit. */=0A=
+	while (spa->spa_trim_thread =3D=3D NULL)=0A=
+		cv_wait(&spa->spa_trim_cv, &spa->spa_trim_lock);=0A=
+	spa->spa_trim_thread =3D NULL;=0A=
+	mutex_exit(&spa->spa_trim_lock);=0A=
+=0A=
+	cv_destroy(&spa->spa_trim_cv);=0A=
+	mutex_destroy(&spa->spa_trim_lock);=0A=
+}=0A=
+=0A=
+void=0A=
+trim_thread_wakeup(spa_t *spa)=0A=
+{=0A=
+=0A=
+	if (!zfs_trim_enabled)=0A=
+		return;=0A=
+	if (spa->spa_trim_thread =3D=3D NULL)=0A=
+		return;=0A=
+=0A=
+	mutex_enter(&spa->spa_trim_lock);=0A=
+	cv_signal(&spa->spa_trim_cv);=0A=
+	mutex_exit(&spa->spa_trim_lock);=0A=
+}=0A=
=0A=
Property changes on: =
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/trim_map.c=0A=
___________________________________________________________________=0A=
Added: svn:mime-type=0A=
## -0,0 +1 ##=0A=
+text/plain=0A=
\ No newline at end of property=0A=
Added: svn:keywords=0A=
## -0,0 +1 ##=0A=
+FreeBSD=3D%H=0A=
\ No newline at end of property=0A=
Added: svn:eol-style=0A=
## -0,0 +1 ##=0A=
+native=0A=
\ No newline at end of property=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c	(revision =
250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c	(working =
copy)=0A=
@@ -397,7 +397,8 @@=0A=
 dsl_free_sync(zio_t *pio, dsl_pool_t *dp, uint64_t txg, const blkptr_t =
*bpp)=0A=
 {=0A=
 	ASSERT(dsl_pool_sync_context(dp));=0A=
-	zio_nowait(zio_free_sync(pio, dp->dp_spa, txg, bpp, pio->io_flags));=0A=
+	zio_nowait(zio_free_sync(pio, dp->dp_spa, txg, bpp, BP_GET_PSIZE(bpp),=0A=
+	    pio->io_flags));=0A=
 }=0A=
 =0A=
 static uint64_t=0A=
@@ -1364,7 +1365,7 @@=0A=
 	}=0A=
 =0A=
 	zio_nowait(zio_free_sync(scn->scn_zio_root, scn->scn_dp->dp_spa,=0A=
-	    dmu_tx_get_txg(tx), bp, 0));=0A=
+	    dmu_tx_get_txg(tx), bp, BP_GET_PSIZE(bp), 0));=0A=
 	dsl_dir_diduse_space(tx->tx_pool->dp_free_dir, DD_USED_HEAD,=0A=
 	    -bp_get_dsize_sync(scn->scn_dp->dp_spa, bp),=0A=
 	    -BP_GET_PSIZE(bp), -BP_GET_UCSIZE(bp), tx);=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c	=
(revision 250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c	(working =
copy)=0A=
@@ -259,7 +259,9 @@=0A=
 	size_t size;=0A=
 =0A=
 	for (c =3D 0; c < rm->rm_firstdatacol; c++) {=0A=
-		zio_buf_free(rm->rm_col[c].rc_data, rm->rm_col[c].rc_size);=0A=
+		if (rm->rm_col[c].rc_data !=3D NULL)=0A=
+			zio_buf_free(rm->rm_col[c].rc_data,=0A=
+			    rm->rm_col[c].rc_size);=0A=
 =0A=
 		if (rm->rm_col[c].rc_gdata !=3D NULL)=0A=
 			zio_buf_free(rm->rm_col[c].rc_gdata,=0A=
@@ -504,14 +506,20 @@=0A=
 	ASSERT3U(rm->rm_asize - asize, =3D=3D, rm->rm_nskip << unit_shift);=0A=
 	ASSERT3U(rm->rm_nskip, <=3D, nparity);=0A=
 =0A=
-	for (c =3D 0; c < rm->rm_firstdatacol; c++)=0A=
-		rm->rm_col[c].rc_data =3D zio_buf_alloc(rm->rm_col[c].rc_size);=0A=
+	if (zio->io_type !=3D ZIO_TYPE_FREE) {=0A=
+		for (c =3D 0; c < rm->rm_firstdatacol; c++) {=0A=
+			rm->rm_col[c].rc_data =3D=0A=
+			    zio_buf_alloc(rm->rm_col[c].rc_size);=0A=
+		}=0A=
 =0A=
-	rm->rm_col[c].rc_data =3D zio->io_data;=0A=
+		rm->rm_col[c].rc_data =3D zio->io_data;=0A=
 =0A=
-	for (c =3D c + 1; c < acols; c++)=0A=
-		rm->rm_col[c].rc_data =3D (char *)rm->rm_col[c - 1].rc_data +=0A=
-		    rm->rm_col[c - 1].rc_size;=0A=
+		for (c =3D c + 1; c < acols; c++) {=0A=
+			rm->rm_col[c].rc_data =3D=0A=
+			    (char *)rm->rm_col[c - 1].rc_data +=0A=
+			    rm->rm_col[c - 1].rc_size;=0A=
+		}=0A=
+	}=0A=
 =0A=
 	/*=0A=
 	 * If all data stored spans all columns, there's a danger that parity=0A=
@@ -1536,6 +1544,18 @@=0A=
 =0A=
 	ASSERT3U(rm->rm_asize, =3D=3D, vdev_psize_to_asize(vd, zio->io_size));=0A=
 =0A=
+	if (zio->io_type =3D=3D ZIO_TYPE_FREE) {=0A=
+		for (c =3D 0; c < rm->rm_cols; c++) {=0A=
+			rc =3D &rm->rm_col[c];=0A=
+			cvd =3D vd->vdev_child[rc->rc_devidx];=0A=
+			zio_nowait(zio_vdev_child_io(zio, NULL, cvd,=0A=
+			    rc->rc_offset, rc->rc_data, rc->rc_size,=0A=
+			    zio->io_type, zio->io_priority, 0,=0A=
+			    vdev_raidz_child_done, rc));=0A=
+		}=0A=
+		return (ZIO_PIPELINE_CONTINUE);=0A=
+	}=0A=
+=0A=
 	if (zio->io_type =3D=3D ZIO_TYPE_WRITE) {=0A=
 		vdev_raidz_generate_parity(rm);=0A=
 =0A=
@@ -1918,6 +1938,8 @@=0A=
 			zio->io_error =3D vdev_raidz_worst_error(rm);=0A=
 =0A=
 		return;=0A=
+	} else if (zio->io_type =3D=3D ZIO_TYPE_FREE) {=0A=
+		return;=0A=
 	}=0A=
 =0A=
 	ASSERT(zio->io_type =3D=3D ZIO_TYPE_READ);=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	(revision =
250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c	(working copy)=0A=
@@ -43,6 +43,7 @@=0A=
 #include <sys/arc.h>=0A=
 #include <sys/zil.h>=0A=
 #include <sys/dsl_scan.h>=0A=
+#include <sys/trim_map.h>=0A=
 =0A=
 SYSCTL_DECL(_vfs_zfs);=0A=
 SYSCTL_NODE(_vfs_zfs, OID_AUTO, vdev, CTLFLAG_RW, 0, "ZFS VDEV");=0A=
@@ -1196,6 +1197,11 @@=0A=
 	if (vd->vdev_ishole || vd->vdev_ops =3D=3D &vdev_missing_ops)=0A=
 		return (0);=0A=
 =0A=
+	if (vd->vdev_ops->vdev_op_leaf) {=0A=
+		vd->vdev_notrim =3D B_FALSE;=0A=
+		trim_map_create(vd);=0A=
+	}=0A=
+=0A=
 	for (int c =3D 0; c < vd->vdev_children; c++) {=0A=
 		if (vd->vdev_child[c]->vdev_state !=3D VDEV_STATE_HEALTHY) {=0A=
 			vdev_set_state(vd, B_TRUE, VDEV_STATE_DEGRADED,=0A=
@@ -1441,6 +1447,9 @@=0A=
 =0A=
 	vdev_cache_purge(vd);=0A=
 =0A=
+	if (vd->vdev_ops->vdev_op_leaf)=0A=
+		trim_map_destroy(vd);=0A=
+=0A=
 	/*=0A=
 	 * We record the previous state before we close it, so that if we are=0A=
 	 * doing a reopen(), we don't generate FMA ereports if we notice that=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/trim_map.h=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/trim_map.h	=
(revision 0)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/trim_map.h	=
(working copy)=0A=
@@ -0,0 +1,51 @@=0A=
+/*=0A=
+ * CDDL HEADER START=0A=
+ *=0A=
+ * The contents of this file are subject to the terms of the=0A=
+ * Common Development and Distribution License (the "License").=0A=
+ * You may not use this file except in compliance with the License.=0A=
+ *=0A=
+ * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE=0A=
+ * or http://www.opensolaris.org/os/licensing.=0A=
+ * See the License for the specific language governing permissions=0A=
+ * and limitations under the License.=0A=
+ *=0A=
+ * When distributing Covered Code, include this CDDL HEADER in each=0A=
+ * file and include the License file at usr/src/OPENSOLARIS.LICENSE.=0A=
+ * If applicable, add the following below this CDDL HEADER, with the=0A=
+ * fields enclosed by brackets "[]" replaced with your own identifying=0A=
+ * information: Portions Copyright [yyyy] [name of copyright owner]=0A=
+ *=0A=
+ * CDDL HEADER END=0A=
+ */=0A=
+/*=0A=
+ * Copyright (c) 2012 Pawel Jakub Dawidek <pawel@dawidek.net>.=0A=
+ * All rights reserved.=0A=
+ */=0A=
+=0A=
+#ifndef _SYS_TRIM_MAP_H=0A=
+#define	_SYS_TRIM_MAP_H=0A=
+=0A=
+#include <sys/avl.h>=0A=
+#include <sys/list.h>=0A=
+#include <sys/spa.h>=0A=
+=0A=
+#ifdef	__cplusplus=0A=
+extern "C" {=0A=
+#endif=0A=
+=0A=
+extern void trim_map_create(vdev_t *vd);=0A=
+extern void trim_map_destroy(vdev_t *vd);=0A=
+extern void trim_map_free(vdev_t *vd, uint64_t offset, uint64_t size, =
uint64_t txg);=0A=
+extern boolean_t trim_map_write_start(zio_t *zio);=0A=
+extern void trim_map_write_done(zio_t *zio);=0A=
+=0A=
+extern void trim_thread_create(spa_t *spa);=0A=
+extern void trim_thread_destroy(spa_t *spa);=0A=
+extern void trim_thread_wakeup(spa_t *spa);=0A=
+=0A=
+#ifdef	__cplusplus=0A=
+}=0A=
+#endif=0A=
+=0A=
+#endif	/* _SYS_TRIM_MAP_H */=0A=
=0A=
Property changes on: =
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/trim_map.h=0A=
___________________________________________________________________=0A=
Added: svn:mime-type=0A=
## -0,0 +1 ##=0A=
+text/plain=0A=
\ No newline at end of property=0A=
Added: svn:keywords=0A=
## -0,0 +1 ##=0A=
+FreeBSD=3D%H=0A=
\ No newline at end of property=0A=
Added: svn:eol-style=0A=
## -0,0 +1 ##=0A=
+native=0A=
\ No newline at end of property=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h	(revision =
250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev.h	(working =
copy)=0A=
@@ -46,6 +46,7 @@=0A=
 } vdev_dtl_type_t;=0A=
 =0A=
 extern boolean_t zfs_nocacheflush;=0A=
+extern boolean_t zfs_trim_enabled;=0A=
 =0A=
 extern int vdev_open(vdev_t *);=0A=
 extern void vdev_open_children(vdev_t *);=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h	=
(revision 250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h	=
(working copy)=0A=
@@ -221,6 +221,9 @@=0A=
 	spa_proc_state_t spa_proc_state;	/* see definition */=0A=
 	struct proc	*spa_proc;		/* "zpool-poolname" process */=0A=
 	uint64_t	spa_did;		/* if procp !=3D p0, did of t1 */=0A=
+	kthread_t	*spa_trim_thread;	/* thread sending TRIM I/Os */=0A=
+	kmutex_t	spa_trim_lock;		/* protects spa_trim_cv */=0A=
+	kcondvar_t	spa_trim_cv;		/* used to notify TRIM thread */=0A=
 	boolean_t	spa_autoreplace;	/* autoreplace set in open */=0A=
 	int		spa_vdev_locks;		/* locks grabbed */=0A=
 	uint64_t	spa_creation_version;	/* version at pool creation */=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_impl.h=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_impl.h	=
(revision 250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio_impl.h	=
(working copy)=0A=
@@ -130,9 +130,9 @@=0A=
 =0A=
 	ZIO_STAGE_READY			=3D 1 << 16,	/* RWFCI */=0A=
 =0A=
-	ZIO_STAGE_VDEV_IO_START		=3D 1 << 17,	/* RW--I */=0A=
-	ZIO_STAGE_VDEV_IO_DONE		=3D 1 << 18,	/* RW--- */=0A=
-	ZIO_STAGE_VDEV_IO_ASSESS	=3D 1 << 19,	/* RW--I */=0A=
+	ZIO_STAGE_VDEV_IO_START		=3D 1 << 17,	/* RWF-I */=0A=
+	ZIO_STAGE_VDEV_IO_DONE		=3D 1 << 18,	/* RWF-- */=0A=
+	ZIO_STAGE_VDEV_IO_ASSESS	=3D 1 << 19,	/* RWF-I */=0A=
 =0A=
 	ZIO_STAGE_CHECKSUM_VERIFY	=3D 1 << 20,	/* R---- */=0A=
 =0A=
@@ -214,7 +214,9 @@=0A=
 	(ZIO_INTERLOCK_STAGES |			\=0A=
 	ZIO_STAGE_FREE_BP_INIT |		\=0A=
 	ZIO_STAGE_ISSUE_ASYNC |			\=0A=
-	ZIO_STAGE_DVA_FREE)=0A=
+	ZIO_STAGE_DVA_FREE |			\=0A=
+	ZIO_STAGE_VDEV_IO_START |		\=0A=
+	ZIO_STAGE_VDEV_IO_ASSESS)=0A=
 =0A=
 #define	ZIO_DDT_FREE_PIPELINE			\=0A=
 	(ZIO_INTERLOCK_STAGES |			\=0A=
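The zio_impl.h hunk above adds ZIO_STAGE_VDEV_IO_START and ZIO_STAGE_VDEV_IO_ASSESS to ZIO_FREE_PIPELINE, so frees now reach the vdev layer. A pipeline is a bitmask of stages, so testing whether an I/O visits a stage is a single AND. The user-space sketch below illustrates this. The three VDEV stage values are taken from the diff; the bit position of STAGE_DVA_FREE, the omission of the interlock stages, and the helper pipeline_has_stage() are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stage bits as shown in the zio_impl.h hunk. */
#define STAGE_VDEV_IO_START	(1u << 17)
#define STAGE_VDEV_IO_DONE	(1u << 18)
#define STAGE_VDEV_IO_ASSESS	(1u << 19)
/* Assumed bit position; the hunk does not show it. */
#define STAGE_DVA_FREE		(1u << 13)

/* Free pipeline before and after the patch (interlock stages omitted). */
#define FREE_PIPELINE_OLD	(STAGE_DVA_FREE)
#define FREE_PIPELINE_NEW \
	(STAGE_DVA_FREE | STAGE_VDEV_IO_START | STAGE_VDEV_IO_ASSESS)

/* Hypothetical helper: does this pipeline include the given stage? */
static bool
pipeline_has_stage(uint32_t pipeline, uint32_t stage)
{
	return ((pipeline & stage) != 0);
}
```

With these masks, the old free pipeline never enters the vdev I/O stages, while the new one enters START and ASSESS.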
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h	=
(revision 250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h	=
(working copy)=0A=
@@ -183,6 +183,7 @@=0A=
 	uint64_t	vdev_unspare;	/* unspare when resilvering done */=0A=
 	hrtime_t	vdev_last_try;	/* last reopen time		*/=0A=
 	boolean_t	vdev_nowritecache; /* true if flushwritecache failed */=0A=
+	boolean_t	vdev_notrim;	/* true if trim failed */=0A=
 	boolean_t	vdev_checkremove; /* temporary online test	*/=0A=
 	boolean_t	vdev_forcefault; /* force online fault		*/=0A=
 	boolean_t	vdev_splitting;	/* split or repair in progress  */=0A=
@@ -198,6 +199,7 @@=0A=
 	spa_aux_vdev_t	*vdev_aux;	/* for l2cache vdevs		*/=0A=
 	zio_t		*vdev_probe_zio; /* root of current probe	*/=0A=
 	vdev_aux_t	vdev_label_aux;	/* on-disk aux state		*/=0A=
+	struct trim_map	*vdev_trimmap;=0A=
 =0A=
 	/*=0A=
 	 * For DTrace to work in userland (libzpool) context, these fields must=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h	(revision =
250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h	(working =
copy)=0A=
@@ -32,6 +32,7 @@=0A=
 #include <sys/spa.h>=0A=
 #include <sys/txg.h>=0A=
 #include <sys/avl.h>=0A=
+#include <sys/kstat.h>=0A=
 #include <sys/fs/zfs.h>=0A=
 #include <sys/zio_impl.h>=0A=
 =0A=
@@ -137,7 +138,8 @@=0A=
 #define	ZIO_PRIORITY_RESILVER		(zio_priority_table[9])=0A=
 #define	ZIO_PRIORITY_SCRUB		(zio_priority_table[10])=0A=
 #define	ZIO_PRIORITY_DDT_PREFETCH	(zio_priority_table[11])=0A=
-#define	ZIO_PRIORITY_TABLE_SIZE		12=0A=
+#define	ZIO_PRIORITY_TRIM		(zio_priority_table[12])=0A=
+#define	ZIO_PRIORITY_TABLE_SIZE		13=0A=
 =0A=
 #define	ZIO_PIPELINE_CONTINUE		0x100=0A=
 #define	ZIO_PIPELINE_STOP		0x101=0A=
@@ -367,6 +369,39 @@=0A=
 	list_node_t	zl_child_node;=0A=
 } zio_link_t;=0A=
 =0A=
+/*=0A=
+ * Used for TRIM kstat.=0A=
+ */=0A=
+typedef struct zio_trim_stats {=0A=
+	/*=0A=
+	 * Number of bytes successfully TRIMmed.=0A=
+	 */=0A=
+	kstat_named_t bytes;=0A=
+=0A=
+	/*=0A=
+	 * Number of successful TRIM requests.=0A=
+	 */=0A=
+	kstat_named_t success;=0A=
+=0A=
+	/*=0A=
+	 * Number of TRIM requests that failed because TRIM is not=0A=
+	 * supported.=0A=
+	 */=0A=
+	kstat_named_t unsupported;=0A=
+=0A=
+	/*=0A=
+	 * Number of TRIM requests that failed for other reasons.=0A=
+	 */=0A=
+	kstat_named_t failed;=0A=
+} zio_trim_stats_t;=0A=
+=0A=
+extern zio_trim_stats_t zio_trim_stats;=0A=
+=0A=
+#define ZIO_TRIM_STAT_INCR(stat, val) \=0A=
+	atomic_add_64(&zio_trim_stats.stat.value.ui64, (val))=0A=
+#define ZIO_TRIM_STAT_BUMP(stat) \=0A=
+	ZIO_TRIM_STAT_INCR(stat, 1)=0A=
+=0A=
 struct zio {=0A=
 	/* Core information about this I/O */=0A=
 	zbookmark_t	io_bookmark;=0A=
@@ -441,6 +476,8 @@=0A=
 	/* FreeBSD only. */=0A=
 	struct ostask	io_task;=0A=
 #endif=0A=
+	avl_node_t	io_trim_node;=0A=
+	list_node_t	io_trim_link;=0A=
 };=0A=
 =0A=
 extern zio_t *zio_null(zio_t *pio, spa_t *spa, vdev_t *vd,=0A=
@@ -472,8 +509,8 @@=0A=
     zio_done_func_t *done, void *priv, enum zio_flag flags);=0A=
 =0A=
 extern zio_t *zio_ioctl(zio_t *pio, spa_t *spa, vdev_t *vd, int cmd,=0A=
-    zio_done_func_t *done, void *priv, int priority,=0A=
-    enum zio_flag flags);=0A=
+    uint64_t offset, uint64_t size, zio_done_func_t *done, void *priv,=0A=
+    int priority, enum zio_flag flags);=0A=
 =0A=
 extern zio_t *zio_read_phys(zio_t *pio, vdev_t *vd, uint64_t offset,=0A=
     uint64_t size, void *data, int checksum,=0A=
@@ -486,12 +523,14 @@=0A=
     boolean_t labels);=0A=
 =0A=
 extern zio_t *zio_free_sync(zio_t *pio, spa_t *spa, uint64_t txg,=0A=
-    const blkptr_t *bp, enum zio_flag flags);=0A=
+    const blkptr_t *bp, uint64_t size, enum zio_flag flags);=0A=
 =0A=
 extern int zio_alloc_zil(spa_t *spa, uint64_t txg, blkptr_t *new_bp,=0A=
     blkptr_t *old_bp, uint64_t size, boolean_t use_slog);=0A=
 extern void zio_free_zil(spa_t *spa, uint64_t txg, blkptr_t *bp);=0A=
 extern void zio_flush(zio_t *zio, vdev_t *vd);=0A=
+extern zio_t *zio_trim(zio_t *zio, spa_t *spa, vdev_t *vd, uint64_t =
offset,=0A=
+    uint64_t size);=0A=
 extern void zio_shrink(zio_t *zio, uint64_t size);=0A=
 =0A=
 extern int zio_wait(zio_t *zio);=0A=
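The ZIO_TRIM_STAT_INCR() and ZIO_TRIM_STAT_BUMP() macros added to zio.h wrap an atomic add on a named counter. Below is a user-space sketch of the same pattern, using C11 atomics in place of atomic_add_64(). The names struct trim_stats, TRIM_STAT_INCR, TRIM_STAT_BUMP and record_trim() are hypothetical and not part of the patch. The macro bodies carry no trailing semicolon, so each call behaves as a single statement even inside an unbraced if/else.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counters standing in for the kstat fields. */
struct trim_stats {
	_Atomic uint64_t bytes;
	_Atomic uint64_t success;
	_Atomic uint64_t failed;
};

static struct trim_stats trim_stats;

/* No trailing ';' here: the caller supplies it. */
#define TRIM_STAT_INCR(stat, val) \
	atomic_fetch_add(&trim_stats.stat, (uint64_t)(val))
#define TRIM_STAT_BUMP(stat)	TRIM_STAT_INCR(stat, 1)

/* Account for one completed TRIM request of 'size' bytes. */
static void
record_trim(int error, uint64_t size)
{
	/* A ';' inside the macro would orphan the 'else' below. */
	if (error == 0)
		TRIM_STAT_INCR(bytes, size);
	else
		TRIM_STAT_BUMP(failed);

	if (error == 0)
		TRIM_STAT_BUMP(success);
}
```

Calling record_trim(0, 4096) adds 4096 to bytes and 1 to success. Any non-zero error only bumps failed.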
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c	(revision =
250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c	(working =
copy)=0A=
@@ -49,14 +49,17 @@=0A=
 =0A=
 DECLARE_GEOM_CLASS(zfs_vdev_class, zfs_vdev);=0A=
 =0A=
-/*=0A=
- * Don't send BIO_FLUSH.=0A=
- */=0A=
+SYSCTL_DECL(_vfs_zfs_vdev);=0A=
+/* Don't send BIO_FLUSH. */=0A=
 static int vdev_geom_bio_flush_disable =3D 0;=0A=
 TUNABLE_INT("vfs.zfs.vdev.bio_flush_disable", =
&vdev_geom_bio_flush_disable);=0A=
-SYSCTL_DECL(_vfs_zfs_vdev);=0A=
 SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, bio_flush_disable, CTLFLAG_RW,=0A=
     &vdev_geom_bio_flush_disable, 0, "Disable BIO_FLUSH");=0A=
+/* Don't send BIO_DELETE. */=0A=
+static int vdev_geom_bio_delete_disable =3D 0;=0A=
+TUNABLE_INT("vfs.zfs.vdev.bio_delete_disable", =
&vdev_geom_bio_delete_disable);=0A=
+SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, bio_delete_disable, CTLFLAG_RW,=0A=
+    &vdev_geom_bio_delete_disable, 0, "Disable BIO_DELETE");=0A=
 =0A=
 static void=0A=
 vdev_geom_orphan(struct g_consumer *cp)=0A=
@@ -663,8 +666,8 @@=0A=
 	*ashift =3D highbit(MAX(pp->sectorsize, SPA_MINBLOCKSIZE)) - 1;=0A=
 =0A=
 	/*=0A=
-	 * Clear the nowritecache bit, so that on a vdev_reopen() we will=0A=
-	 * try again.=0A=
+	 * Clear the nowritecache settings, so that on a vdev_reopen()=0A=
+	 * we will try again.=0A=
 	 */=0A=
 	vd->vdev_nowritecache =3D B_FALSE;=0A=
 =0A=
@@ -710,6 +713,15 @@=0A=
 		 */=0A=
 		vd->vdev_nowritecache =3D B_TRUE;=0A=
 	}=0A=
+	if (bp->bio_cmd =3D=3D BIO_DELETE && bp->bio_error =3D=3D ENOTSUP) {=0A=
+		/*=0A=
+		 * If we get ENOTSUP, we know that no future=0A=
+		 * attempts will ever succeed.  In this case we=0A=
+		 * set a persistent bit so that we don't bother=0A=
+		 * with the ioctl in the future.=0A=
+		 */=0A=
+		vd->vdev_notrim =3D B_TRUE;=0A=
+	}=0A=
 	if (zio->io_error =3D=3D EIO && !vd->vdev_remove_wanted) {=0A=
 		/*=0A=
 		 * If provider's error is set we assume it is being=0A=
@@ -752,18 +764,22 @@=0A=
 		}=0A=
 =0A=
 		switch (zio->io_cmd) {=0A=
-=0A=
 		case DKIOCFLUSHWRITECACHE:=0A=
-=0A=
 			if (zfs_nocacheflush || vdev_geom_bio_flush_disable)=0A=
 				break;=0A=
-=0A=
 			if (vd->vdev_nowritecache) {=0A=
 				zio->io_error =3D ENOTSUP;=0A=
 				break;=0A=
 			}=0A=
-=0A=
 			goto sendreq;=0A=
+		case DKIOCTRIM:=0A=
+			if (vdev_geom_bio_delete_disable)=0A=
+				break;=0A=
+			if (vd->vdev_notrim) {=0A=
+				zio->io_error =3D ENOTSUP;=0A=
+				break;=0A=
+			}=0A=
+			goto sendreq;=0A=
 		default:=0A=
 			zio->io_error =3D ENOTSUP;=0A=
 		}=0A=
@@ -787,11 +803,21 @@=0A=
 		bp->bio_length =3D zio->io_size;=0A=
 		break;=0A=
 	case ZIO_TYPE_IOCTL:=0A=
-		bp->bio_cmd =3D BIO_FLUSH;=0A=
-		bp->bio_flags |=3D BIO_ORDERED;=0A=
-		bp->bio_data =3D NULL;=0A=
-		bp->bio_offset =3D cp->provider->mediasize;=0A=
-		bp->bio_length =3D 0;=0A=
+		switch (zio->io_cmd) {=0A=
+		case DKIOCFLUSHWRITECACHE:=0A=
+			bp->bio_cmd =3D BIO_FLUSH;=0A=
+			bp->bio_flags |=3D BIO_ORDERED;=0A=
+			bp->bio_data =3D NULL;=0A=
+			bp->bio_offset =3D cp->provider->mediasize;=0A=
+			bp->bio_length =3D 0;=0A=
+			break;=0A=
+		case DKIOCTRIM:=0A=
+			bp->bio_cmd =3D BIO_DELETE;=0A=
+			bp->bio_data =3D NULL;=0A=
+			bp->bio_offset =3D zio->io_offset;=0A=
+			bp->bio_length =3D zio->io_size;=0A=
+			break;=0A=
+		}=0A=
 		break;=0A=
 	}=0A=
 	bp->bio_done =3D vdev_geom_io_intr;=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c	(revision =
250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c	(working copy)=0A=
@@ -67,6 +67,7 @@=0A=
 #include <sys/dsl_userhold.h>=0A=
 #include <sys/zfeature.h>=0A=
 #include <sys/zvol.h>=0A=
+#include <sys/trim_map.h>=0A=
 =0A=
 #ifdef	_KERNEL=0A=
 #include <sys/callb.h>=0A=
@@ -1001,6 +1002,11 @@=0A=
 		spa_create_zio_taskqs(spa);=0A=
 	}=0A=
 =0A=
+	/*=0A=
+	 * Start TRIM thread.=0A=
+	 */=0A=
+	trim_thread_create(spa);=0A=
+=0A=
 	list_create(&spa->spa_config_dirty_list, sizeof (vdev_t),=0A=
 	    offsetof(vdev_t, vdev_config_dirty_node));=0A=
 	list_create(&spa->spa_state_dirty_list, sizeof (vdev_t),=0A=
@@ -1029,6 +1035,12 @@=0A=
 	ASSERT(spa->spa_async_zio_root =3D=3D NULL);=0A=
 	ASSERT(spa->spa_state !=3D POOL_STATE_UNINITIALIZED);=0A=
 =0A=
+	/*=0A=
+	 * Stop TRIM thread in case spa_unload() wasn't called directly=0A=
+	 * before spa_deactivate().=0A=
+	 */=0A=
+	trim_thread_destroy(spa);=0A=
+=0A=
 	txg_list_destroy(&spa->spa_vdev_txg_list);=0A=
 =0A=
 	list_destroy(&spa->spa_config_dirty_list);=0A=
@@ -1145,6 +1157,11 @@=0A=
 	ASSERT(MUTEX_HELD(&spa_namespace_lock));=0A=
 =0A=
 	/*=0A=
+	 * Stop TRIM thread.=0A=
+	 */=0A=
+	trim_thread_destroy(spa);=0A=
+=0A=
+	/*=0A=
 	 * Stop async tasks.=0A=
 	 */=0A=
 	spa_async_suspend(spa);=0A=
@@ -5875,7 +5892,7 @@=0A=
 	zio_t *zio =3D arg;=0A=
 =0A=
 	zio_nowait(zio_free_sync(zio, zio->io_spa, dmu_tx_get_txg(tx), bp,=0A=
-	    zio->io_flags));=0A=
+	    BP_GET_PSIZE(bp), zio->io_flags));=0A=
 	return (0);=0A=
 }=0A=
 =0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	(revision =
250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	(working copy)=0A=
@@ -130,6 +130,7 @@=0A=
 #endif=0A=
 #include <sys/callb.h>=0A=
 #include <sys/kstat.h>=0A=
+#include <sys/trim_map.h>=0A=
 #include <zfs_fletcher.h>=0A=
 #include <sys/sdt.h>=0A=
 =0A=
@@ -1691,6 +1692,8 @@=0A=
 		}=0A=
 =0A=
 		if (l2hdr !=3D NULL) {=0A=
+			trim_map_free(l2hdr->b_dev->l2ad_vdev, l2hdr->b_daddr,=0A=
+			    hdr->b_size, 0);=0A=
 			list_remove(l2hdr->b_dev->l2ad_buflist, hdr);=0A=
 			ARCSTAT_INCR(arcstat_l2_size, -hdr->b_size);=0A=
 			kmem_free(l2hdr, sizeof (l2arc_buf_hdr_t));=0A=
@@ -3528,6 +3531,8 @@=0A=
 	buf->b_private =3D NULL;=0A=
 =0A=
 	if (l2hdr) {=0A=
+		trim_map_free(l2hdr->b_dev->l2ad_vdev, l2hdr->b_daddr,=0A=
+		    hdr->b_size, 0);=0A=
 		list_remove(l2hdr->b_dev->l2ad_buflist, hdr);=0A=
 		kmem_free(l2hdr, sizeof (l2arc_buf_hdr_t));=0A=
 		ARCSTAT_INCR(arcstat_l2_size, -buf_size);=0A=
@@ -4442,6 +4447,8 @@=0A=
 			list_remove(buflist, ab);=0A=
 			abl2 =3D ab->b_l2hdr;=0A=
 			ab->b_l2hdr =3D NULL;=0A=
+			trim_map_free(abl2->b_dev->l2ad_vdev, abl2->b_daddr,=0A=
+			    ab->b_size, 0);=0A=
 			kmem_free(abl2, sizeof (l2arc_buf_hdr_t));=0A=
 			ARCSTAT_INCR(arcstat_l2_size, -ab->b_size);=0A=
 		}=0A=
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c	(revision =
250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c	(working copy)=0A=
@@ -35,6 +35,7 @@=0A=
 #include <sys/dmu_objset.h>=0A=
 #include <sys/arc.h>=0A=
 #include <sys/ddt.h>=0A=
+#include <sys/trim_map.h>=0A=
 =0A=
 SYSCTL_DECL(_vfs_zfs);=0A=
 SYSCTL_NODE(_vfs_zfs, OID_AUTO, zio, CTLFLAG_RW, 0, "ZFS ZIO");=0A=
@@ -43,6 +44,19 @@=0A=
 SYSCTL_INT(_vfs_zfs_zio, OID_AUTO, use_uma, CTLFLAG_RDTUN, =
&zio_use_uma, 0,=0A=
     "Use uma(9) for ZIO allocations");=0A=
 =0A=
+zio_trim_stats_t zio_trim_stats =3D {=0A=
+	{ "bytes",		KSTAT_DATA_UINT64,=0A=
+	  "Number of bytes successfully TRIMmed" },=0A=
+	{ "success",		KSTAT_DATA_UINT64,=0A=
+	  "Number of successful TRIM requests" },=0A=
+	{ "unsupported",	KSTAT_DATA_UINT64,=0A=
+	  "Number of TRIM requests that failed because TRIM is not supported" =
},=0A=
+	{ "failed",		KSTAT_DATA_UINT64,=0A=
+	  "Number of TRIM requests that failed for reasons other than not =
supported" },=0A=
+};=0A=
+=0A=
+static kstat_t *zio_trim_ksp;=0A=
+=0A=
 /*=0A=
  * =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
  * I/O priority table=0A=
@@ -61,6 +75,7 @@=0A=
 	10,	/* ZIO_PRIORITY_RESILVER	*/=0A=
 	20,	/* ZIO_PRIORITY_SCRUB		*/=0A=
 	2,	/* ZIO_PRIORITY_DDT_PREFETCH	*/=0A=
+	30,	/* ZIO_PRIORITY_TRIM		*/=0A=
 };=0A=
 =0A=
 /*=0A=
@@ -209,6 +224,16 @@=0A=
 		zfs_mg_alloc_failures =3D 8;=0A=
 =0A=
 	zio_inject_init();=0A=
+=0A=
+	zio_trim_ksp =3D kstat_create("zfs", 0, "zio_trim", "misc",=0A=
+	    KSTAT_TYPE_NAMED,=0A=
+	    sizeof(zio_trim_stats) / sizeof(kstat_named_t),=0A=
+	    KSTAT_FLAG_VIRTUAL);=0A=
+=0A=
+	if (zio_trim_ksp !=3D NULL) {=0A=
+		zio_trim_ksp->ks_data =3D &zio_trim_stats;=0A=
+		kstat_install(zio_trim_ksp);=0A=
+	}=0A=
 }=0A=
 =0A=
 void=0A=
@@ -236,6 +261,11 @@=0A=
 	kmem_cache_destroy(zio_cache);=0A=
 =0A=
 	zio_inject_fini();=0A=
+=0A=
+	if (zio_trim_ksp !=3D NULL) {=0A=
+		kstat_delete(zio_trim_ksp);=0A=
+		zio_trim_ksp =3D NULL;=0A=
+	}=0A=
 }=0A=
 =0A=
 /*=0A=
@@ -543,7 +573,7 @@=0A=
 {=0A=
 	zio_t *zio;=0A=
 =0A=
-	ASSERT3U(size, <=3D, SPA_MAXBLOCKSIZE);=0A=
+	ASSERT(type =3D=3D ZIO_TYPE_FREE || size <=3D SPA_MAXBLOCKSIZE);=0A=
 	ASSERT(P2PHASE(size, SPA_MINBLOCKSIZE) =3D=3D 0);=0A=
 	ASSERT(P2PHASE(offset, SPA_MINBLOCKSIZE) =3D=3D 0);=0A=
 =0A=
@@ -730,7 +760,7 @@=0A=
 =0A=
 zio_t *=0A=
 zio_free_sync(zio_t *pio, spa_t *spa, uint64_t txg, const blkptr_t *bp,=0A=
-    enum zio_flag flags)=0A=
+    uint64_t size, enum zio_flag flags)=0A=
 {=0A=
 	zio_t *zio;=0A=
 =0A=
@@ -743,7 +773,7 @@=0A=
 =0A=
 	metaslab_check_free(spa, bp);=0A=
 =0A=
-	zio =3D zio_create(pio, spa, txg, bp, NULL, BP_GET_PSIZE(bp),=0A=
+	zio =3D zio_create(pio, spa, txg, bp, NULL, size,=0A=
 	    NULL, NULL, ZIO_TYPE_FREE, ZIO_PRIORITY_FREE, flags,=0A=
 	    NULL, 0, NULL, ZIO_STAGE_OPEN, ZIO_FREE_PIPELINE);=0A=
 =0A=
@@ -780,15 +810,16 @@=0A=
 }=0A=
 =0A=
 zio_t *=0A=
-zio_ioctl(zio_t *pio, spa_t *spa, vdev_t *vd, int cmd,=0A=
-    zio_done_func_t *done, void *private, int priority, enum zio_flag =
flags)=0A=
+zio_ioctl(zio_t *pio, spa_t *spa, vdev_t *vd, int cmd, uint64_t offset,=0A=
+    uint64_t size, zio_done_func_t *done, void *private, int priority,=0A=
+    enum zio_flag flags)=0A=
 {=0A=
 	zio_t *zio;=0A=
 	int c;=0A=
 =0A=
 	if (vd->vdev_children =3D=3D 0) {=0A=
-		zio =3D zio_create(pio, spa, 0, NULL, NULL, 0, done, private,=0A=
-		    ZIO_TYPE_IOCTL, priority, flags, vd, 0, NULL,=0A=
+		zio =3D zio_create(pio, spa, 0, NULL, NULL, size, done, private,=0A=
+		    ZIO_TYPE_IOCTL, priority, flags, vd, offset, NULL,=0A=
 		    ZIO_STAGE_OPEN, ZIO_IOCTL_PIPELINE);=0A=
 =0A=
 		zio->io_cmd =3D cmd;=0A=
@@ -797,7 +828,7 @@=0A=
 =0A=
 		for (c =3D 0; c < vd->vdev_children; c++)=0A=
 			zio_nowait(zio_ioctl(zio, spa, vd->vdev_child[c], cmd,=0A=
-			    done, private, priority, flags));=0A=
+			    offset, size, done, private, priority, flags));=0A=
 	}=0A=
 =0A=
 	return (zio);=0A=
@@ -922,11 +953,22 @@=0A=
 void=0A=
 zio_flush(zio_t *zio, vdev_t *vd)=0A=
 {=0A=
-	zio_nowait(zio_ioctl(zio, zio->io_spa, vd, DKIOCFLUSHWRITECACHE,=0A=
+	zio_nowait(zio_ioctl(zio, zio->io_spa, vd, DKIOCFLUSHWRITECACHE, 0, 0,=0A=
 	    NULL, NULL, ZIO_PRIORITY_NOW,=0A=
 	    ZIO_FLAG_CANFAIL | ZIO_FLAG_DONT_PROPAGATE | ZIO_FLAG_DONT_RETRY));=0A=
 }=0A=
 =0A=
+zio_t *=0A=
+zio_trim(zio_t *zio, spa_t *spa, vdev_t *vd, uint64_t offset, uint64_t =
size)=0A=
+{=0A=
+=0A=
+	ASSERT(vd->vdev_ops->vdev_op_leaf);=0A=
+=0A=
+	return zio_ioctl(zio, spa, vd, DKIOCTRIM, offset, size,=0A=
+	    NULL, NULL, ZIO_PRIORITY_TRIM,=0A=
+	    ZIO_FLAG_CANFAIL | ZIO_FLAG_DONT_PROPAGATE | ZIO_FLAG_DONT_RETRY);=0A=
+}=0A=
+=0A=
 void=0A=
 zio_shrink(zio_t *zio, uint64_t size)=0A=
 {=0A=
@@ -1549,6 +1591,7 @@=0A=
 zio_free_gang(zio_t *pio, blkptr_t *bp, zio_gang_node_t *gn, void *data)=0A=
 {=0A=
 	return (zio_free_sync(pio, pio->io_spa, pio->io_txg, bp,=0A=
+	    BP_IS_GANG(bp) ? SPA_GANGBLOCKSIZE : BP_GET_PSIZE(bp),=0A=
 	    ZIO_GANG_CHILD_FLAGS(pio)));=0A=
 }=0A=
 =0A=
@@ -1681,7 +1724,7 @@=0A=
 		}=0A=
 	}=0A=
 =0A=
-	if (gn =3D=3D gio->io_gang_tree)=0A=
+	if (gn =3D=3D gio->io_gang_tree && gio->io_data !=3D NULL)=0A=
 		ASSERT3P((char *)gio->io_data + gio->io_size, =3D=3D, data);=0A=
 =0A=
 	if (zio !=3D pio)=0A=
@@ -2403,7 +2446,7 @@=0A=
 =0A=
 /*=0A=
  * =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
- * Read and write to physical devices=0A=
+ * Read, write and delete to physical devices=0A=
  * =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
  */=0A=
 static int=0A=
@@ -2426,6 +2469,11 @@=0A=
 		return (vdev_mirror_ops.vdev_op_io_start(zio));=0A=
 	}=0A=
 =0A=
+	if (vd->vdev_ops->vdev_op_leaf && zio->io_type =3D=3D ZIO_TYPE_FREE) {=0A=
+		trim_map_free(vd, zio->io_offset, zio->io_size, zio->io_txg);=0A=
+		return (ZIO_PIPELINE_CONTINUE);=0A=
+	}=0A=
+=0A=
 	/*=0A=
 	 * We keep track of time-sensitive I/Os so that the scan thread=0A=
 	 * can quickly react to certain workloads.  In particular, we care=0A=
@@ -2450,18 +2498,22 @@=0A=
 =0A=
 	if (P2PHASE(zio->io_size, align) !=3D 0) {=0A=
 		uint64_t asize =3D P2ROUNDUP(zio->io_size, align);=0A=
-		char *abuf =3D zio_buf_alloc(asize);=0A=
+		char *abuf =3D NULL;=0A=
+		if (zio->io_type =3D=3D ZIO_TYPE_READ ||=0A=
+		    zio->io_type =3D=3D ZIO_TYPE_WRITE)=0A=
+			abuf =3D zio_buf_alloc(asize);=0A=
 		ASSERT(vd =3D=3D vd->vdev_top);=0A=
 		if (zio->io_type =3D=3D ZIO_TYPE_WRITE) {=0A=
 			bcopy(zio->io_data, abuf, zio->io_size);=0A=
 			bzero(abuf + zio->io_size, asize - zio->io_size);=0A=
 		}=0A=
-		zio_push_transform(zio, abuf, asize, asize, zio_subblock);=0A=
+		zio_push_transform(zio, abuf, asize, abuf ? asize : 0,=0A=
+		    zio_subblock);=0A=
 	}=0A=
 =0A=
 	ASSERT(P2PHASE(zio->io_offset, align) =3D=3D 0);=0A=
 	ASSERT(P2PHASE(zio->io_size, align) =3D=3D 0);=0A=
-	VERIFY(zio->io_type !=3D ZIO_TYPE_WRITE || spa_writeable(spa));=0A=
+	VERIFY(zio->io_type =3D=3D ZIO_TYPE_READ || spa_writeable(spa));=0A=
 =0A=
 	/*=0A=
 	 * If this is a repair I/O, and there's no self-healing involved --=0A=
@@ -2501,6 +2553,11 @@=0A=
 		}=0A=
 	}=0A=
 =0A=
+	if (vd->vdev_ops->vdev_op_leaf && zio->io_type =3D=3D ZIO_TYPE_WRITE) {=0A=
+		if (!trim_map_write_start(zio))=0A=
+			return (ZIO_PIPELINE_STOP);=0A=
+	}=0A=
+=0A=
 	return (vd->vdev_ops->vdev_op_io_start(zio));=0A=
 }=0A=
 =0A=
@@ -2514,10 +2571,17 @@=0A=
 	if (zio_wait_for_children(zio, ZIO_CHILD_VDEV, ZIO_WAIT_DONE))=0A=
 		return (ZIO_PIPELINE_STOP);=0A=
 =0A=
-	ASSERT(zio->io_type =3D=3D ZIO_TYPE_READ || zio->io_type =3D=3D =
ZIO_TYPE_WRITE);=0A=
+	ASSERT(zio->io_type =3D=3D ZIO_TYPE_READ ||=0A=
+	    zio->io_type =3D=3D ZIO_TYPE_WRITE || zio->io_type =3D=3D =
ZIO_TYPE_FREE);=0A=
 =0A=
-	if (vd !=3D NULL && vd->vdev_ops->vdev_op_leaf) {=0A=
+	if (vd !=3D NULL && vd->vdev_ops->vdev_op_leaf &&=0A=
+	    zio->io_type =3D=3D ZIO_TYPE_WRITE) {=0A=
+		trim_map_write_done(zio);=0A=
+	}=0A=
 =0A=
+	if (vd !=3D NULL && vd->vdev_ops->vdev_op_leaf &&=0A=
+	    (zio->io_type =3D=3D ZIO_TYPE_READ || zio->io_type =3D=3D =
ZIO_TYPE_WRITE)) {=0A=
+=0A=
 		vdev_queue_io_done(zio);=0A=
 =0A=
 		if (zio->io_type =3D=3D ZIO_TYPE_WRITE)=0A=
@@ -2592,6 +2656,20 @@=0A=
 	if (zio_injection_enabled && zio->io_error =3D=3D 0)=0A=
 		zio->io_error =3D zio_handle_fault_injection(zio, EIO);=0A=
 =0A=
+	if (zio->io_type =3D=3D ZIO_TYPE_IOCTL && zio->io_cmd =3D=3D DKIOCTRIM)=0A=
+		switch (zio->io_error) {=0A=
+		case 0:=0A=
+			ZIO_TRIM_STAT_INCR(bytes, zio->io_size);=0A=
+			ZIO_TRIM_STAT_BUMP(success);=0A=
+			break;=0A=
+		case ENOTSUP:=0A=
+			ZIO_TRIM_STAT_BUMP(unsupported);=0A=
+			break;=0A=
+		default:=0A=
+			ZIO_TRIM_STAT_BUMP(failed);=0A=
+			break;=0A=
+		}=0A=
+=0A=
 	/*=0A=
 	 * If the I/O failed, determine whether we should attempt to retry it.=0A=
 	 *=0A=
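The zio_create() hunk in this file relaxes the SPA_MAXBLOCKSIZE limit for ZIO_TYPE_FREE, since a single free can cover far more than one block. When such an exemption is written with a three-operand comparison macro, the left operand is evaluated in full before the comparison, so a left operand of the form `cond || size` collapses to 0 or 1 and the check can never fail. The user-space sketch below shows the difference. CHECK3U, MAXBLOCK, enum io_type and both helper functions are hypothetical stand-ins, and CHECK3U only approximates how ASSERT3U casts and compares its operands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAXBLOCK	(128u * 1024u)	/* assumed limit for this sketch */

enum io_type { IO_WRITE, IO_FREE };

/* Cast both sides to uint64_t, then compare them with 'op'. */
#define CHECK3U(left, op, right) \
	((uint64_t)(left) op (uint64_t)(right))

/* Left operand is (type == IO_FREE || size), which is only 0 or 1. */
static bool
size_ok_folded(enum io_type type, uint64_t size)
{
	return (CHECK3U(type == IO_FREE || size, <=, MAXBLOCK));
}

/* Exemption and size comparison kept as separate terms. */
static bool
size_ok_split(enum io_type type, uint64_t size)
{
	return (type == IO_FREE || size <= MAXBLOCK);
}
```

In this sketch size_ok_folded() accepts an oversized write, whereas size_ok_split() rejects it and still exempts frees.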
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c	=
(revision 250526)=0A=
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_label.c	(working =
copy)=0A=
@@ -145,8 +145,14 @@=0A=
 #include <sys/metaslab.h>=0A=
 #include <sys/zio.h>=0A=
 #include <sys/dsl_scan.h>=0A=
+#include <sys/trim_map.h>=0A=
 #include <sys/fs/zfs.h>=0A=
 =0A=
+static boolean_t vdev_trim_on_init =3D B_TRUE;=0A=
+SYSCTL_DECL(_vfs_zfs_vdev);=0A=
+SYSCTL_INT(_vfs_zfs_vdev, OID_AUTO, trim_on_init, CTLFLAG_RW,=0A=
+    &vdev_trim_on_init, 0, "Enable/disable full vdev trim on =
initialisation");=0A=
+=0A=
 /*=0A=
  * Basic routines to read and write from a vdev label.=0A=
  * Used throughout the rest of this file.=0A=
@@ -718,6 +724,16 @@=0A=
 	}=0A=
 =0A=
 	/*=0A=
+	 * TRIM the whole thing so that we start with a clean slate.=0A=
+	 * It's just an optimization, so we don't care if it fails.=0A=
+	 * Don't TRIM if removing so that we don't interfere with zpool=0A=
+	 * disaster recovery.=0A=
+	 */=0A=
+	if (zfs_trim_enabled && vdev_trim_on_init && (reason =3D=3D =
VDEV_LABEL_CREATE ||=0A=
+	    reason =3D=3D VDEV_LABEL_SPARE || reason =3D=3D =
VDEV_LABEL_L2CACHE))=0A=
+		zio_wait(zio_trim(NULL, spa, vd, 0, vd->vdev_psize));=0A=
+=0A=
+	/*=0A=
 	 * Initialize its label.=0A=
 	 */=0A=
 	vp =3D zio_buf_alloc(sizeof (vdev_phys_t));=0A=
@@ -1282,5 +1298,10 @@=0A=
 	 * to disk to ensure that all odd-label updates are committed to=0A=
 	 * stable storage before the next transaction group begins.=0A=
 	 */=0A=
-	return (vdev_label_sync_list(spa, 1, txg, flags));=0A=
+	if ((error =3D vdev_label_sync_list(spa, 1, txg, flags)) !=3D 0)=0A=
+		return (error);=0A=
+=0A=
+	trim_thread_wakeup(spa);=0A=
+=0A=
+	return (0);=0A=
 }=0A=
Index: sys/cddl/compat/opensolaris/kern/opensolaris_kstat.c=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/compat/opensolaris/kern/opensolaris_kstat.c	(revision =
250526)=0A=
+++ sys/cddl/compat/opensolaris/kern/opensolaris_kstat.c	(working copy)=0A=
@@ -118,7 +118,7 @@=0A=
 		SYSCTL_ADD_PROC(&ksp->ks_sysctl_ctx,=0A=
 		    SYSCTL_CHILDREN(ksp->ks_sysctl_root), OID_AUTO, ksent->name,=0A=
 		    CTLTYPE_U64 | CTLFLAG_RD, ksent, sizeof(*ksent),=0A=
-		    kstat_sysctl, "QU", "");=0A=
+		    kstat_sysctl, "QU", ksent->desc);=0A=
 	}=0A=
 }=0A=
 =0A=
Index: sys/cddl/compat/opensolaris/sys/dkio.h=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/compat/opensolaris/sys/dkio.h	(revision 250526)=0A=
+++ sys/cddl/compat/opensolaris/sys/dkio.h	(working copy)=0A=
@@ -75,6 +75,8 @@=0A=
  */=0A=
 #define	DKIOCFLUSHWRITECACHE	(DKIOC|34)	/* flush cache to phys medium */=0A=
 =0A=
+#define	DKIOCTRIM		(DKIOC|35)	/* TRIM a block */=0A=
+=0A=
 struct dk_callback {=0A=
 	void (*dkc_callback)(void *dkc_cookie, int error);=0A=
 	void *dkc_cookie;=0A=
Index: sys/cddl/compat/opensolaris/sys/time.h=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/compat/opensolaris/sys/time.h	(revision 250526)=0A=
+++ sys/cddl/compat/opensolaris/sys/time.h	(working copy)=0A=
@@ -35,6 +35,7 @@=0A=
 #define MILLISEC	1000=0A=
 #define MICROSEC	1000000=0A=
 #define NANOSEC		1000000000=0A=
+#define TIME_MAX	LLONG_MAX=0A=
 =0A=
 typedef longlong_t	hrtime_t;=0A=
 =0A=
Index: sys/cddl/compat/opensolaris/sys/kstat.h=0A=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=0A=
--- sys/cddl/compat/opensolaris/sys/kstat.h	(revision 250526)=0A=
+++ sys/cddl/compat/opensolaris/sys/kstat.h	(working copy)=0A=
@@ -53,6 +53,8 @@=0A=
 #define	KSTAT_DATA_INT64	3=0A=
 #define	KSTAT_DATA_UINT64	4=0A=
 	uchar_t	data_type;=0A=
+#define	KSTAT_DESCLEN		128=0A=
+	char	desc[KSTAT_DESCLEN];=0A=
 	union {=0A=
 		uint64_t	ui64;=0A=
 	} value;=0A=

------=_NextPart_000_0391_01CE581A.96C6FBA0--



