Date: Mon, 6 Dec 2010 21:28:42 +0100 From: Ivan Voras <ivoras@freebsd.org> To: Pawel Jakub Dawidek <pjd@freebsd.org> Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: Re: svn commit: r216230 - head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs Message-ID: <AANLkTinUjzrUWc9Xn-iQYmSMZKVxCw9G-0HH868YonMy@mail.gmail.com> In-Reply-To: <20101206195327.GD1936@garage.freebsd.pl> References: <201012061218.oB6CI3oW032770@svn.freebsd.org> <20101206184453.GA1936@garage.freebsd.pl> <20101206192238.GB1936@garage.freebsd.pl> <AANLkTine9rGq_cM4ruFXYq=-F7cMXcQAr-zKHuWoQs2z@mail.gmail.com> <20101206195327.GD1936@garage.freebsd.pl>
next in thread | previous in thread | raw e-mail | index | archive | help
Firstly, thank you, your explanations and questions are very good and address the problems in a good way to discuss about it. I respect that you took the time to write this answer! On 6 December 2010 20:53, Pawel Jakub Dawidek <pjd@freebsd.org> wrote: > On Mon, Dec 06, 2010 at 08:35:36PM +0100, Ivan Voras wrote: >> Please persuade me on technical grounds why ashift, a property >> intended for address alignment, should not be set in this way. If your >> answer is "I don't know but you are still wrong because I say so" I >> will respect it and back it out but only until I/we discuss the >> question with upstream ZFS developers. > > No. You persuade me why changing ashift in ZFS, which, as the comment > clearly states is "device's minimum transfer size" is better and not > hackish than presenting the disk with properly configured sector size. You are right, it was, at least originally, defined as the device's minimum transfer size, and it is in this context hackish to change it to something that isn't it. But there are two reasons that I think are important, which resulted in changing this: 1) It is being used out of the original context in the mailing list posts I've referenced - it was being used (and in a worse way, by having a hacked zpool binary) for alignment like I did in the patch, despite that the sectorsize wasn't changed. 2) The solaris "big sector" project, described in http://arc.opensolaris.org/caselog/PSARC/2008/769/final_spec.txt basically blesses ZFS to work with big sectors, and states "ZFS can automatically run on larger sector size disk after the label change and I/O path change." Since the effect of having a larger sectorsize will effectively be only the change in ashift, this is what I've done. I would of course also be happy with simply having drives with sectorsize=4096, but I've given up on this route because of two reasons: mav was essentially against it since it will break compatibility if suddenly changed and the posts on opensolaris mailing lists are basically saying that it is likely that the 512/4096 kludge will remain "forever" and it's unlikely to change. > This can not only affect disks that still use 512 bytes sectors, but > doesn't fix the problem at all. It just works around the problem in ZFS > when configured on top of raw disks. I'm not sure how to parse "can not only affect disks that still use 512 byte sectors" - if you are saying that it can affect disks with 512 byte sectors, I can think of only one case when it can have a serious effect: if the drive (or GEOM) suddenly use 4 KiB sectorsize on a drive which was previously formatted with ZFS as a drive with 512 byte sectors. All other combinations work because ashift is a part of ZFS metadata - this calculation is overriden in case e.g. a zvol created with ashift=12 is imported on a kernel/OS which only sees 512 byte sectors. > What about other file systems? What about other GEOM classes? GELI is > great example here, as people use ZFS on top of GELI alot. GELI > integrity verification works in a way that not reporting disk sector > size properly will have huge negative performance impact. ZFS' ashift > won't change that. Does this mean that, despite that ZFS is correctly aligned, GELI can break this alignment policy so still, at the end, produce wrong alignments for IOs? If so, I understand your concern, but it looks like a problem with GELI, not ZFS. The same problem will occur if any other file system type is created which is "properly aligned" from the POW of the administrator. > So you should back this change out, provide technical arguments (if they > exist) that this is the right solution to the problem and not "hey, here > is a patch, I think it is ok". Thank you, I will try to remember to give more details in commits. > BTW. ZFS is no longer open-source if you didn't notice. I have noticed but I don't understand how it affects FreeBSD. I would like to discuss this sometime but unless you have some urgent interpretation of this information, I think it's best to start another thread about it. I think that, basically, the whole argument boils down to the question like "while ZFS will most likely implement big sector support in this way, it doesn't right now, so should we do it ourselves?" I hope that this information I've written helps explain why I did the patch. If you are still against it, I will back it out.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?AANLkTinUjzrUWc9Xn-iQYmSMZKVxCw9G-0HH868YonMy>