From: Ravi Pokala <rpokala@freebsd.org>
To: freebsd-fs@freebsd.org
Date: Tue, 09 Apr 2019 10:53:20 -0700
Subject: Re: about zfs and ashift and changing ashift on existing zpool
-----Original Message-----
Date: Mon, 8 Apr 2019 20:00:09 -0400
From: "Kevin P. Neal"
To: Peter Jeremy
Cc: tech-lists, freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: about zfs and ashift and changing ashift on existing zpool

...
> Be careful with that. I've got a new pool I made with 4Kn drives that ZFS
> failed to detect were 4Kn drives. The resulting ashift was 9, but I did
> catch it early enough to avoid too much pain. Still, it shouldn't have
> happened. I've since set the sysctl so it doesn't accidentally happen in
> the future.
...
> Logical block size: 512 bytes
> Physical block size: 4096 bytes

That is *NOT* an AF-4Kn drive; it is an AF-512e drive. To refresh everyone's memory:

XXXn: "native"   - logical block size == physical block size
XXXe: "emulated" - logical block size < physical block size

512n: logical block size = 512B; physical block size = 512B
- Retronym; basically everything commercially available before 2013 was 512n

AF-512e: logical block size = 512B; physical block size > 512B
- Uses a larger physical block size for greater media encoding efficiency
- Retains the same 512B logical block size that software has expected for decades
- Firmware performs a read/modify/write cycle for any write smaller than the physical block size
- I have only seen AF-512e drives with a 4KB physical block size, but apparently drives with even larger physical blocks are on the market or on the horizon.
AF-4Kn: logical block size = 4KB; physical block size = 4KB
- Uses a 4KB physical block size for greater media encoding efficiency
- Uses a 4KB logical block size, so the firmware never has to read/modify/write
- Requires software to be updated to perform logical-block-sized operations

AF-4Ke: logical block size = 4KB; physical block size > 4KB
- I have not seen one of these, but I have heard references to them.

And then there are SSDs, which might report 512B or 4KB as either their logical or physical block sizes, but which actually use much larger NAND program blocks, and even larger NAND erase blocks. But everything about SSD "geometry" is a lie anyway.

Also: the FreeBSD GEOM stack is aware of the difference between physical and logical block sizes. For a disk device, GEOM reports the logical block size as the "sector" size, and the physical block size as the "stripe" size. I **think** some (all?) of the in-tree GEOM classes will prefer "stripe"-sized I/Os if possible, falling back to "sector"-sized I/Os if necessary.

Thanks,

Ravi (rpokala@)
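P.S. For anyone who wants to script around this, here is a rough sketch of the taxonomy above in Python. The function names are mine, illustrative only, not from any FreeBSD or ZFS tool; the ashift rule is simply log2 of the block size you want ZFS to honor, so aligning to the physical block size avoids the firmware read/modify/write on 512e drives:

```python
# Illustrative sketch only: classify a drive by its reported
# logical/physical block sizes, per the taxonomy in this mail.

def classify(logical, physical):
    """Return the Advanced Format class for the given sizes (bytes)."""
    if logical == 512 and physical == 512:
        return "512n"
    if logical == 512 and physical > 512:
        return "AF-512e"
    if logical == 4096 and physical == 4096:
        return "AF-4Kn"
    if logical == 4096 and physical > 4096:
        return "AF-4Ke"
    return "unknown"

def preferred_ashift(logical, physical):
    """ashift = log2(block size); prefer the physical size so writes
    never straddle a physical block (assumes power-of-two sizes)."""
    return max(logical, physical).bit_length() - 1

# The drive Kevin quoted: 512B logical / 4KB physical.
print(classify(512, 4096), preferred_ashift(512, 4096))   # AF-512e 12
print(classify(4096, 4096), preferred_ashift(4096, 4096)) # AF-4Kn 12
```

Note that the pool Kevin described got ashift=9 (512B) despite the 4KB physical size, which is exactly the mismatch this rule avoids; I believe the sysctl he refers to is the one that sets a floor on the auto-detected ashift.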