Date: Wed, 17 Sep 2014 07:24:51 -0600 From: "Justin T. Gibbs" <gibbs@FreeBSD.org> To: Andriy Gapon <avg@FreeBSD.org> Cc: svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, Xin LI <delphij@FreeBSD.org> Subject: Re: svn commit: r269093 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs Message-ID: <75BE5420-586F-4BDA-9D5C-F46A114E84EE@FreeBSD.org> In-Reply-To: <5419648D.2030305@FreeBSD.org> References: <201407251841.s6PIfvUm002202@svn.freebsd.org> <5419648D.2030305@FreeBSD.org>
next in thread | previous in thread | raw e-mail | index | archive | help
On Sep 17, 2014, at 4:38 AM, Andriy Gapon <avg@FreeBSD.org> wrote: > On 25/07/2014 21:41, Xin LI wrote: >> Author: delphij >> Date: Fri Jul 25 18:41:56 2014 >> New Revision: 269093 >> URL: http://svnweb.freebsd.org/changeset/base/269093 >>=20 >> Log: >> Transform the I/O when vdev_physical_ashift is greater than >> SPA_MINBLOCKSHIFT. >=20 > This commit seems illogical to me. > I do not see why a fact that the _physical_ sector size is greater = than 512B > (e.g. 4KB) should force a data buffer of a "physical" zio to be = rounded up using > ashift. > To give an example. ZFS knows that a vdev has physical sector size of = 4KB, but > a logical sector size of 512B. ZFS uses ashift of 12 for the vdev. > Nevertheless in some situations ZFS wants to make a physical / special = write of > 512B (taking advantage of RMW). Why should we force that write to = become 4KB > with data beyond the first 512B being all zeros? >=20 > It would be logical to check for a logical sector size and to enforce = it, but > that is not what this commit does. >=20 > This commit is against the spirit of commit r268855. It was = specifically to > allow ZFS physical writes to have smaller alignments than otherwise = would be > dictated by ashift. I too don=92t see the motivation behind this commit unless it was an = attempt to get 4K native devices to work (i.e. devices that fail, by = design, accesses smaller than 4K). If this is the case, this is not the = correct fix. Since the transform doesn=92t do RMW, it seems like this = can also cause data loss. =20 >=20 > I think that this commit should be reverted. If ZFS wants to make a = physical > I/O operation with a particular alignment then it must be allowed to. = If that > operation is invalid for a certain vdev, then it is higher level code = that must > be modified to account for the logical sector size. The physical zio = must be > _created_ correctly, it should not be silently transformed. vdev_logical_ashift must be consulted when performing ZIO_FLAG_PHYSICAL = I/O instead of assuming vdev_logical_ashift is 9. (It is unfortunate = that the zio flag is named this way since the I/O is *not* constrained = by the physical characteristics of the media, but rather by the logical = restrictions of the software running on it.) r268855 doesn=92t do this = because illumos doesn=92t (yet?) have this information in their vdev = structure. However, we=92ve had it since r254591 so the MFV in r268855 = wasn=92t complete. Catching the misalignment in zio.c is more = user/developer friendly than having the CAM layer return EIO and = convince ZFS that this is a media failure. But, as Andriy points out, = the upper level code that might issue these I/Os must be changed to = understand =93logical sector size=94. =97 Justin
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?75BE5420-586F-4BDA-9D5C-F46A114E84EE>