From owner-freebsd-fs@FreeBSD.ORG Tue Dec 21 15:02:49 2010
From: "Emil Smolenski" <am@raisa.eu.org>
To: freebsd-fs@freebsd.org
Date: Tue, 21 Dec 2010 15:29:01 +0100
Subject: [ZFS] Booting from zpool created on 4k-sector drive

Hello,

There is a hack to force zpool creation with a minimum sector size of 4k:

# gnop create -S 4096 ${DEV0}
# zpool create tank ${DEV0}.nop
# zpool export tank
# gnop destroy ${DEV0}.nop
# zpool import tank

A zpool created this way is much faster on the problematic 4k-sector drives that lie about their sector size (like the WD EARS). The hack works perfectly fine while the system is running: the gnop layer is needed only for the "zpool create" command, because ZFS stores the sector size in the pool metadata. After creating the pool one can export it, remove the gnop layer and reimport it.

The difference can be seen in the output of the zdb command:

- on a 512-byte-sector device (2**9 = 512):

  % zdb tank |grep ashift
              ashift=9

- on a 4096-byte-sector device (2**12 = 4096):

  % zdb tank |grep ashift
              ashift=12

This setting is permanent: the only way to change the value of ashift is to destroy the pool, recreate it and restore the data from backup.

But there is one problem: I cannot boot from such a pool. Error message:

  ZFS: i/o error - all block copies unavailable
  ZFS: can't read MOS
  ZFS: unexpected object set type 0

This is a standard configuration with a GPT scheme:

# gpart show da0
=>        34  2930211565  da0  GPT  (1.4T)
          34          30       - free -  (15K)
          64         128    1  freebsd-boot  (64K)
         192     4194304    2  freebsd-swap  (2.0G)
     4194496     8388608    3  freebsd-zfs  (4.0G)
    12583104  2917628495       - free -  (1.4T)

# zpool status tank
  pool: tank
 state: ONLINE
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        tank         ONLINE       0     0     0
          gpt/tank0  ONLINE       0     0     0

# zdb -uuu tank
Uberblock

        magic = 0000000000bab10c
        version = 15
        txg = 2838
        guid_sum = 12371721502612965633
        timestamp = 1292860198 UTC = Mon Dec 20 15:49:58 2010
        rootbp = [L0 DMU objset] 800L/200P DVA[0]=<0:2041000:1000> DVA[1]=<0:30062000:1000> DVA[2]=<0:ee0bd000:1000> fletcher4 lzjb LE contiguous birth=2838 fill=374 cksum=c9605617d:4e2cf0a8c94:f6decb77086a:210752c3aee4a8

The system runs FreeBSD 8.2-PRERELEASE. Below is the output of my lame pseudo-debug code in zfsimpl.c; the format is "_func_:returncode".
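The instrumentation itself is trivial -- every probed function just prints its name and its return code. A minimal, compilable sketch of the idea (DBG_RET and vdev_probe_example() are illustrative stand-ins, not the actual patch to zfsimpl.c):

#include <stdio.h>

/*
 * Print "name:rc" after every probed call; the real loader code
 * uses its own printf, but the idea is the same.
 */
#define DBG_RET(name, rc)   printf("%s:%d ", (name), (rc))

/* Stand-in for one of the probed zfsimpl.c functions. */
static int
vdev_probe_example(void)
{
        return (5);     /* 5 == EIO, the failure code seen below */
}

int
main(void)
{
        int rc;

        rc = vdev_probe_example();
        DBG_RET("vdev_probe", rc);      /* prints "vdev_probe:5 " */
        printf("\n");
        return (0);
}

With checks like this sprinkled over zfsimpl.c, booting produces the following trace: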
(...)
vdev_probe:5
vdev_read_phys:0
nvlist_find:0
nvlist_find:0
nvlist_find:0
nvlist_find:0
nvlist_find:0
nvlist_find:5
spa_find_by_guid:0
spa_create:3621543968
nvlist_find:0
vdev_find:0
nvlist_find:0
nvlist_find:0
nvlist_find:0
nvlist_find:0
nvlist_find:5
nvlist_find:5
nvlist_find:5
nvlist_find:5
nvlist_find:5
vdev_find:0
vdev_create:3621548269
nvlist_find:0
nvlist_find:5
nvlist_find:0
nvlist_find:5
vdev_init_from_nvlist:0
vdev_find:3621548269
vdev_read_phys:5 #(condition: (bp && zio_checksum_error(bp, buf)))
vdev_read_phys:5 #(condition: (bp && zio_checksum_error(bp, buf)))
(...) #(many times)
vdev_read_phys:5 #(condition: (bp && zio_checksum_error(bp, buf)))
vdev_read_phys:5 #(condition: (bp && zio_checksum_error(bp, buf)))
vdev_probe:0
zfs_alloc_temp:3620233248
ZFS: i/o error - all block copies unavailable
zio_read:5
ZFS: can't read MOS
zfs_mount_pool:5
ZFS: unexpected object set type 0
zfs_lookup:5
ZFS: unexpected object set type 0
zfs_lookup:5
(...)

I don't know whether this information is useful. If it is not, please send me patches with more suitable debug code. Thanks!

Background information: IMO this issue is critical. Almost all >=2TB disks have a 4k sector size nowadays, and in the near future (>=3TB?) all disks will. If I create a zpool on 512-byte-sector disks, I won't be able to attach new 4k disks to it:

# zpool create tank ${DEV0}
# gnop create -S 4096 ${DEV1}
# zpool attach tank ${DEV0} ${DEV1}.nop
cannot attach ${DEV1}.nop to ${DEV0}: devices have different sector alignment

Disks with a 512-byte sector size will probably soon disappear from the market, so there needs to be a way to create a _bootable_ zpool with a 4k sector size even on 512-byte-sector disks. Then we can attach both 512-byte- and 4k-sector disks:

# gnop create -S 4096 ${DEV1}
# zpool create tank ${DEV1}.nop
# zpool attach tank ${DEV1}.nop ${DEV0}
# zpool status tank
  pool: tank
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Fri Dec 17 15:47:28 2010
config:

        NAME             STATE     READ WRITE CKSUM
        tank             ONLINE       0     0     0
          mirror         ONLINE       0     0     0
            ${DEV1}.nop  ONLINE       0     0     0
            ${DEV0}      ONLINE       0     0     0  448K resilvered

The boot problem prevents me from using this workaround.

--
am