From owner-svn-src-all@FreeBSD.ORG Mon Dec 6 20:29:23 2010 Return-Path: Delivered-To: svn-src-all@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C64E21065673; Mon, 6 Dec 2010 20:29:23 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175]) by mx1.freebsd.org (Postfix) with ESMTP id 39E6F8FC1A; Mon, 6 Dec 2010 20:29:22 +0000 (UTC) Received: by qyk8 with SMTP id 8so3693543qyk.13 for ; Mon, 06 Dec 2010 12:29:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:mime-version:sender:received :in-reply-to:references:from:date:x-google-sender-auth:message-id :subject:to:cc:content-type; bh=xbr8q+ENHC6yNo2lsB5L/cS8T9QqWfkgbvoYbzCS0xg=; b=s+ZtcLzD48gHkRYknEyDI8Ai2hSIwmUZYpWAe9CsbqDx4uoKJhnqXaEMdCEFiSlF2e M5ONRdWoA/lihJXY77TRjT1e9nvqeQ6pNoqxgk6+qvruBnBWsi+a2DdeXdM1LiZV47Zc TNn4rKLDxe4quMSjKw/FUvuD+1iLQSahUcLsA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; b=MlhB14xuq48xoOmSlAw8W8f7nXuDxAssXOurWGxa6DYb0i0PQbr1LW98Rtzw258l8J 7oPyFl/1937CFy1ToT2joReTCP65LROd6umGYZ5pA71t9QUFoXpX3qzjpTgeGSec2JYf cM6ue7D9WMrbkigNQdOaqAi2XegGF0SntUVA8= Received: by 10.229.82.70 with SMTP id a6mr4946901qcl.9.1291667362442; Mon, 06 Dec 2010 12:29:22 -0800 (PST) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.229.231.143 with HTTP; Mon, 6 Dec 2010 12:28:42 -0800 (PST) In-Reply-To: <20101206195327.GD1936@garage.freebsd.pl> References: <201012061218.oB6CI3oW032770@svn.freebsd.org> <20101206184453.GA1936@garage.freebsd.pl> <20101206192238.GB1936@garage.freebsd.pl> <20101206195327.GD1936@garage.freebsd.pl> From: Ivan Voras Date: Mon, 6 Dec 2010 21:28:42 +0100 X-Google-Sender-Auth: bV3NGUlTNPbUqIRoq7dxS7q6UjE Message-ID: To: Pawel Jakub Dawidek Content-Type: text/plain; charset=UTF-8 Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r216230 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-BeenThere: svn-src-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "SVN commit messages for the entire src tree \(except for " user" and " projects" \)" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 06 Dec 2010 20:29:23 -0000 Firstly, thank you, your explanations and questions are very good and address the problems in a good way to discuss about it. I respect that you took the time to write this answer! On 6 December 2010 20:53, Pawel Jakub Dawidek wrote: > On Mon, Dec 06, 2010 at 08:35:36PM +0100, Ivan Voras wrote: >> Please persuade me on technical grounds why ashift, a property >> intended for address alignment, should not be set in this way. If your >> answer is "I don't know but you are still wrong because I say so" I >> will respect it and back it out but only until I/we discuss the >> question with upstream ZFS developers. > > No. You persuade me why changing ashift in ZFS, which, as the comment > clearly states is "device's minimum transfer size" is better and not > hackish than presenting the disk with properly configured sector size. You are right, it was, at least originally, defined as the device's minimum transfer size, and it is in this context hackish to change it to something that isn't it. But there are two reasons that I think are important, which resulted in changing this: 1) It is being used out of the original context in the mailing list posts I've referenced - it was being used (and in a worse way, by having a hacked zpool binary) for alignment like I did in the patch, despite that the sectorsize wasn't changed. 2) The solaris "big sector" project, described in http://arc.opensolaris.org/caselog/PSARC/2008/769/final_spec.txt basically blesses ZFS to work with big sectors, and states "ZFS can automatically run on larger sector size disk after the label change and I/O path change." Since the effect of having a larger sectorsize will effectively be only the change in ashift, this is what I've done. I would of course also be happy with simply having drives with sectorsize=4096, but I've given up on this route because of two reasons: mav was essentially against it since it will break compatibility if suddenly changed and the posts on opensolaris mailing lists are basically saying that it is likely that the 512/4096 kludge will remain "forever" and it's unlikely to change. > This can not only affect disks that still use 512 bytes sectors, but > doesn't fix the problem at all. It just works around the problem in ZFS > when configured on top of raw disks. I'm not sure how to parse "can not only affect disks that still use 512 byte sectors" - if you are saying that it can affect disks with 512 byte sectors, I can think of only one case when it can have a serious effect: if the drive (or GEOM) suddenly use 4 KiB sectorsize on a drive which was previously formatted with ZFS as a drive with 512 byte sectors. All other combinations work because ashift is a part of ZFS metadata - this calculation is overriden in case e.g. a zvol created with ashift=12 is imported on a kernel/OS which only sees 512 byte sectors. > What about other file systems? What about other GEOM classes? GELI is > great example here, as people use ZFS on top of GELI alot. GELI > integrity verification works in a way that not reporting disk sector > size properly will have huge negative performance impact. ZFS' ashift > won't change that. Does this mean that, despite that ZFS is correctly aligned, GELI can break this alignment policy so still, at the end, produce wrong alignments for IOs? If so, I understand your concern, but it looks like a problem with GELI, not ZFS. The same problem will occur if any other file system type is created which is "properly aligned" from the POW of the administrator. > So you should back this change out, provide technical arguments (if they > exist) that this is the right solution to the problem and not "hey, here > is a patch, I think it is ok". Thank you, I will try to remember to give more details in commits. > BTW. ZFS is no longer open-source if you didn't notice. I have noticed but I don't understand how it affects FreeBSD. I would like to discuss this sometime but unless you have some urgent interpretation of this information, I think it's best to start another thread about it. I think that, basically, the whole argument boils down to the question like "while ZFS will most likely implement big sector support in this way, it doesn't right now, so should we do it ourselves?" I hope that this information I've written helps explain why I did the patch. If you are still against it, I will back it out.