Message-Id: <201806290247.w5T2lDd6065483@kx.openedu.org>
Date: Fri, 29 Jun 2018 11:47:13 +0900
From: KIRIYAMA Kazuhiko <kiri@kx.openedu.org>
To: Toomas Soome
Cc: KIRIYAMA Kazuhiko, Allan Jude, freebsd-current@freebsd.org
Subject: Re: ZFS: I/O error - blocks larger than 16777216 are not supported

At Tue, 26 Jun 2018 09:48:10 +0300,
Toomas Soome wrote:
> 
> > On 26 Jun 2018, at 05:08, KIRIYAMA Kazuhiko wrote:
> > 
> > At Thu, 21 Jun 2018 10:48:28 +0300,
> > Toomas Soome wrote:
> >> 
> >>> On 21 Jun 2018, at 09:00, KIRIYAMA Kazuhiko wrote:
> >>> 
> >>> At Wed, 20 Jun 2018 23:34:48 -0400,
> >>> Allan Jude wrote:
> >>>> 
> >>>> On 2018-06-20 21:36, KIRIYAMA Kazuhiko wrote:
> >>>>> Hi all,
> >>>>> 
> >>>>> I reported a problem with ZFS boot being disabled [1], and found
> >>>>> that the issue comes from the RAID configuration [2]. So I
> >>>>> rebuilt the array as RAID5 and re-installed 12.0-CURRENT
> >>>>> (r333982), but it failed to boot with:
> >>>>> 
> >>>>> ZFS: i/o error - all block copies unavailable
> >>>>> ZFS: can't read MOS of pool zroot
> >>>>> gptzfsboot: failed to mount default pool zroot
> >>>>> 
> >>>>> FreeBSD/x86 boot
> >>>>> ZFS: I/O error - blocks larger than 16777216 are not supported
> >>>>> ZFS: can't find dataset u
> >>>>> Default: zroot/<0x0>:
> >>>>> 
> >>>>> In this case the reason is "blocks larger than 16777216 are not
> >>>>> supported", and I guess this means that datasets with a record
> >>>>> size larger than 16777216 bytes (16 MB) are not supported by the
> >>>>> FreeBSD boot loader (zpool-features(7)). Is that true?
> >>>>> 
> >>>>> My zpool features are as follows:
> >>>>> 
> >>>>> # kldload zfs
> >>>>> # zpool import
> >>>>>    pool: zroot
> >>>>>      id: 13407092850382881815
> >>>>>   state: ONLINE
> >>>>>  status: The pool was last accessed by another system.
> >>>>>  action: The pool can be imported using its name or numeric identifier and
> >>>>> 	the '-f' flag.
> >>>>>    see: http://illumos.org/msg/ZFS-8000-EY
> >>>>>  config:
> >>>>> 
> >>>>> 	zroot      ONLINE
> >>>>> 	  mfid0p3  ONLINE
> >>>>> # zpool import -fR /mnt zroot
> >>>>> # zpool list
> >>>>> NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> >>>>> zroot  19.9T   129G  19.7T         -     0%     0%  1.00x  ONLINE  /mnt
> >>>>> # zpool get all zroot
> >>>>> NAME   PROPERTY                                   VALUE                 SOURCE
> >>>>> zroot  size                                       19.9T                 -
> >>>>> zroot  capacity                                   0%                    -
> >>>>> zroot  altroot                                    /mnt                  local
> >>>>> zroot  health                                     ONLINE                -
> >>>>> zroot  guid                                       13407092850382881815  default
> >>>>> zroot  version                                    -                     default
> >>>>> zroot  bootfs                                     zroot/ROOT/default    local
> >>>>> zroot  delegation                                 on                    default
> >>>>> zroot  autoreplace                                off                   default
> >>>>> zroot  cachefile                                  none                  local
> >>>>> zroot  failmode                                   wait                  default
> >>>>> zroot  listsnapshots                              off                   default
> >>>>> zroot  autoexpand                                 off                   default
> >>>>> zroot  dedupditto                                 0                     default
> >>>>> zroot  dedupratio                                 1.00x                 -
> >>>>> zroot  free                                       19.7T                 -
> >>>>> zroot  allocated                                  129G                  -
> >>>>> zroot  readonly                                   off                   -
> >>>>> zroot  comment                                    -                     default
> >>>>> zroot  expandsize                                 -                     -
> >>>>> zroot  freeing                                    0                     default
> >>>>> zroot  fragmentation                              0%                    -
> >>>>> zroot  leaked                                     0                     default
> >>>>> zroot  feature@async_destroy                      enabled               local
> >>>>> zroot  feature@empty_bpobj                        active                local
> >>>>> zroot  feature@lz4_compress                       active                local
> >>>>> zroot  feature@multi_vdev_crash_dump              enabled               local
> >>>>> zroot  feature@spacemap_histogram                 active                local
> >>>>> zroot  feature@enabled_txg                        active                local
> >>>>> zroot  feature@hole_birth                         active                local
> >>>>> zroot  feature@extensible_dataset                 enabled               local
> >>>>> zroot  feature@embedded_data                      active                local
> >>>>> zroot  feature@bookmarks                          enabled               local
> >>>>> zroot  feature@filesystem_limits                  enabled               local
> >>>>> zroot  feature@large_blocks                       enabled               local
> >>>>> zroot  feature@sha512                             enabled               local
> >>>>> zroot  feature@skein                              enabled               local
> >>>>> zroot  unsupported@com.delphix:device_removal     inactive              local
> >>>>> zroot  unsupported@com.delphix:obsolete_counts    inactive              local
> >>>>> zroot  unsupported@com.delphix:zpool_checkpoint   inactive              local
> >>>>> #
> >>>>> 
> >>>>> Regards
> >>>>> 
> >>>>> [1] https://lists.freebsd.org/pipermail/freebsd-current/2018-March/068886.html
> >>>>> [2] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=151910
> >>>>> 
> >>>>> ---
> >>>>> KIRIYAMA Kazuhiko
> >>>> 
> >>>> I am guessing it means something is corrupt, as 16MB is the maximum
> >>>> size of a record in ZFS. Also, the 'large_blocks' feature is
> >>>> 'enabled', not 'active', so this suggests you do not have any
> >>>> records larger than 128kb on your pool.
> >>> 
> >>> As I mentioned above, [2] says ZFS on RAID disks has serious bugs
> >>> except for mirror. Anyway, I have given up on using ZFS on
> >>> RAID{5,6} until Bug 151910 [2] is fixed.
> >>> 
> >> 
> >> if you boot from usb stick (or cd), press esc at boot loader menu and
> >> enter lsdev -v. what sector and disk sizes are reported?
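
(A side note on Allan's point about large_blocks: once the pool is
imported from a rescue environment, this can be confirmed directly.
A minimal check, assuming the pool name zroot as above:

# zfs get -r -o name,value recordsize zroot
# zpool get feature@large_blocks zroot

If every dataset reports 128K and large_blocks stays "enabled" rather
than "active", no record on the pool is larger than the loader can
read, which points at mis-read sectors rather than real oversized
blocks as the cause of the 16777216 message.)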
> > 
> > OK lsdev -v
> > disk devices:
> > disk0:   BIOS drive C (31588352 X 512)
> >   disk0p1: FreeBSD boot   512KB
> >   disk0p2: FreeBSD UFS     13GB
> >   disk0p3: FreeBSD swap   771MB
> > disk1:   BIOS drive D (4294967295 X 512)
> >   disk1p1: FreeBSD boot   512KB
> >   disk1p2: FreeBSD swap   128GB
> >   disk1p3: FreeBSD ZFS     19TB
> > OK
> > 
> > Does this mean that the whole disk size I can use is
> > 2TB (4294967295 X 512)?
> 
> Yes; or to be exact, that is the disk size reported by INT13, and
> since, as below, you get the same value from UEFI, the limit seems to
> be set by the RAID controller itself. In this case the best way to
> address the issue is to create one smaller lun for the boot disk
> (zroot pool) and a larger one for data. Or of course you can have a
> separate FreeBSD ZFS partition for zroot; just make sure it fits
> inside the first 2TB.
> 
> Of course there may be an option for a RAID firmware update, or
> configuration settings for the lun, or JBOD mode (if supported by the
> card). JBOD would be best, because in the current setup the pool is
> vulnerable to silent data corruption (checksum errors) and has no way
> to recover; this is the reason why RAID setups are not preferred with
> zfs.

My RAID card is an AVAGO MegaRAID (SAS-MFI BIOS Version 6.36.00.0),
and I found that it can enable JBOD mode. So I switched the card from
RAID mode to JBOD mode and made each disk a JBOD. After a reboot,
'lsdev -v' at the loader prompt showed every disk recognized as a
single device 'mfidX' (X = 0, 1, ..., 11). I then re-installed with
ZFS RAIDZ-3 and UEFI boot. The result is fine!!! Each disk was
recognized in full (2TB each), and a zpool (zroot) was built as
raidz3 from the ZFS partitions on those disks:

OK lsdev -v
PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,80)
    disk0: 3907029168 X 512 blocks
      disk0p1: EFI            200MB
      disk0p2: FreeBSD swap  8192MB
      disk0p3: FreeBSD ZFS   1854GB
PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,81)
    disk1: 3907029168 X 512 blocks
      disk1p1: EFI            200MB
      disk1p2: FreeBSD swap  8192MB
      disk1p3: FreeBSD ZFS   1854GB
PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,82)
    disk2: 3907029168 X 512 blocks
      disk2p1: EFI            200MB
      disk2p2: FreeBSD swap  8192MB
      disk2p3: FreeBSD ZFS   1854GB
    :
PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,8B)
    disk11: 3907029168 X 512 blocks
      disk11p1: EFI            200MB
      disk11p2: FreeBSD swap  8192MB
      disk11p3: FreeBSD ZFS   1854GB
net devices:
zfs devices:
  pool: zroot
bootfs: zroot/ROOT/default
config:

        NAME          STATE
        zroot         ONLINE
          raidz3      ONLINE
            mfid0p3   ONLINE
            mfid1p3   ONLINE
            mfid2p3   ONLINE
            mfid3p3   ONLINE
            mfid4p3   ONLINE
            mfid5p3   ONLINE
            mfid6p3   ONLINE
            mfid7p3   ONLINE
            mfid8p3   ONLINE
            mfid9p3   ONLINE
            mfid10p3  ONLINE
            mfid11p3  ONLINE
OK

The resulting ZFS file system on FreeBSD 12.0-CURRENT (r335317) is as
follows:

# gpart show mfid0
=>        40  3907029088  mfid0  GPT  (1.8T)
          40      409600      1  efi  (200M)
      409640        2008         - free -  (1.0M)
      411648    16777216      2  freebsd-swap  (8.0G)
    17188864  3889840128      3  freebsd-zfs  (1.8T)
  3907028992         136         - free -  (68K)

# zpool status
  pool: zroot
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          raidz3-0    ONLINE       0     0     0
            mfid0p3   ONLINE       0     0     0
            mfid1p3   ONLINE       0     0     0
            mfid2p3   ONLINE       0     0     0
            mfid3p3   ONLINE       0     0     0
            mfid4p3   ONLINE       0     0     0
            mfid5p3   ONLINE       0     0     0
            mfid6p3   ONLINE       0     0     0
            mfid7p3   ONLINE       0     0     0
            mfid8p3   ONLINE       0     0     0
            mfid9p3   ONLINE       0     0     0
            mfid10p3  ONLINE       0     0     0
            mfid11p3  ONLINE       0     0     0

errors: No known data errors

# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zroot  21.6T  2.55G  21.6T        -         -     0%     0%  1.00x  ONLINE  -
# zpool get all zroot
NAME   PROPERTY                       VALUE                 SOURCE
zroot  size                           21.6T                 -
zroot  capacity                       0%                    -
zroot  altroot                        -                     default
zroot  health                         ONLINE                -
zroot  guid                           2002381236893751526   default
zroot  version                        -                     default
zroot  bootfs                         zroot/ROOT/default    local
zroot  delegation                     on                    default
zroot  autoreplace                    off                   default
zroot  cachefile                      -                     default
zroot  failmode                       wait                  default
zroot  listsnapshots                  off                   default
zroot  autoexpand                     off                   default
zroot  dedupditto                     0                     default
zroot  dedupratio                     1.00x                 -
zroot  free                           21.6T                 -
zroot  allocated                      2.55G                 -
zroot  readonly                       off                   -
zroot  comment                        -                     default
zroot  expandsize                     -                     -
zroot  freeing                        0                     default
zroot  fragmentation                  0%                    -
zroot  leaked                         0                     default
zroot  bootsize                       -                     default
zroot  checkpoint                     -                     -
zroot  feature@async_destroy          enabled               local
zroot  feature@empty_bpobj            active                local
zroot  feature@lz4_compress           active                local
zroot  feature@multi_vdev_crash_dump  enabled               local
zroot  feature@spacemap_histogram     active                local
zroot  feature@enabled_txg            active                local
zroot  feature@hole_birth             active                local
zroot  feature@extensible_dataset     enabled               local
zroot  feature@embedded_data          active                local
zroot  feature@bookmarks              enabled               local
zroot  feature@filesystem_limits      enabled               local
zroot  feature@large_blocks           enabled               local
zroot  feature@sha512                 enabled               local
zroot  feature@skein                  enabled               local
zroot  feature@device_removal         enabled               local
zroot  feature@obsolete_counts        enabled               local
zroot  feature@zpool_checkpoint       enabled               local
# uname -a
FreeBSD vm.openedu.org 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r335317: Mon Jun 18 16:21:17 UTC 2018     root@releng3.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
# df -h
Filesystem            Size    Used   Avail Capacity  Mounted on
zroot/ROOT/default     15T    191M     15T     0%    /
devfs                 1.0K    1.0K      0B   100%    /dev
zroot/.dake            15T    256K     15T     0%    /.dake
zroot/ds               15T    279K     15T     0%    /ds
zroot/ds/backup        15T    256K     15T     0%    /ds/backup
zroot/ds/distfiles     15T    256K     15T     0%    /ds/distfiles
zroot/ds/obj           15T    256K     15T     0%    /ds/obj
zroot/ds/packages      15T    256K     15T     0%    /ds/packages
zroot/ds/ports         15T    256K     15T     0%    /ds/ports
zroot/ds/src           15T    256K     15T     0%    /ds/src
zroot/tmp              15T    302K     15T     0%    /tmp
zroot/usr              15T    1.6G     15T     0%    /usr
zroot/usr/home         15T    372K     15T     0%    /usr/home
zroot/usr/local        15T    256K     15T     0%    /usr/local
zroot/var              15T    395K     15T     0%    /var
zroot/var/audit        15T    256K     15T     0%    /var/audit
zroot/var/crash        15T    256K     15T     0%    /var/crash
zroot/var/db           15T    9.2M     15T     0%    /var/db
zroot/var/empty        15T    256K     15T     0%    /var/empty
zroot/var/log          15T    337K     15T     0%    /var/log
zroot/var/mail         15T    256K     15T     0%    /var/mail
zroot/var/ports        15T    256K     15T     0%    /var/ports
zroot/var/run          15T    442K     15T     0%    /var/run
zroot/var/tmp          15T    256K     15T     0%    /var/tmp
zroot/vm               15T    256K     15T     0%    /vm
zroot                  15T    256K     15T     0%    /zroot
#

Thanks for the kind advice!

> 
> rgds,
> toomas
> 
> > 
> >> 
> >> the issue [2] is a mix of ancient FreeBSD (v8.1 is mentioned there)
> >> and RAID luns with 512B sector size and 15TB (!) total size - are
> >> you really sure your BIOS can actually address a 15TB lun (with
> >> 512B sector size)? Note that the problem with large disks can hide
> >> itself until the pool fills up enough that essential files get
> >> stored above the limit, meaning that you may have a "perfectly
> >> working" setup until, at some point after the next update, it
> >> suddenly does not work any more.
> >> 
> > 
> > I see why I could use it for a while.
> > 
> >> Note that for the boot loader we have only INT13h for the BIOS
> >> version, and it really is limited. The UEFI version uses the
> >> EFI_BLOCK_IO API, which usually handles large sectors and disk
> >> sizes better.
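
(To spell the limit out: 4294967295 is 2^32 - 1, i.e. a 32-bit sector
count, so with 512-byte sectors the firmware can expose at most
4294967295 x 512 = 2199023255040 bytes, just under 2 TiB. That is
exactly the figure lsdev printed above; a quick sanity check:

# echo '4294967295 * 512' | bc
2199023255040
)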
> > 
> > I re-installed the machine with UEFI boot:
> > 
> > # gpart show mfid0
> > =>          40  42965401520  mfid0  GPT  (20T)
> >             40       409600      1  efi  (200M)
> >         409640         2008         - free -  (1.0M)
> >         411648    268435456      2  freebsd-swap  (128G)
> >      268847104  42696552448      3  freebsd-zfs  (20T)
> >    42965399552         2008         - free -  (1.0M)
> > 
> > # uname -a
> > FreeBSD vm.openedu.org 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r335317: Mon Jun 18 16:21:17 UTC 2018     root@releng3.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
> > # zpool get all zroot
> > NAME   PROPERTY                       VALUE                 SOURCE
> > zroot  size                           19.9T                 -
> > zroot  capacity                       0%                    -
> > zroot  altroot                        -                     default
> > zroot  health                         ONLINE                -
> > zroot  guid                           11079446129259852576  default
> > zroot  version                        -                     default
> > zroot  bootfs                         zroot/ROOT/default    local
> > zroot  delegation                     on                    default
> > zroot  autoreplace                    off                   default
> > zroot  cachefile                      -                     default
> > zroot  failmode                       wait                  default
> > zroot  listsnapshots                  off                   default
> > zroot  autoexpand                     off                   default
> > zroot  dedupditto                     0                     default
> > zroot  dedupratio                     1.00x                 -
> > zroot  free                           19.9T                 -
> > zroot  allocated                      1.67G                 -
> > zroot  readonly                       off                   -
> > zroot  comment                        -                     default
> > zroot  expandsize                     -                     -
> > zroot  freeing                        0                     default
> > zroot  fragmentation                  0%                    -
> > zroot  leaked                         0                     default
> > zroot  bootsize                       -                     default
> > zroot  checkpoint                     -                     -
> > zroot  feature@async_destroy          enabled               local
> > zroot  feature@empty_bpobj            active                local
> > zroot  feature@lz4_compress           active                local
> > zroot  feature@multi_vdev_crash_dump  enabled               local
> > zroot  feature@spacemap_histogram     active                local
> > zroot  feature@enabled_txg            active                local
> > zroot  feature@hole_birth             active                local
> > zroot  feature@extensible_dataset     enabled               local
> > zroot  feature@embedded_data          active                local
> > zroot  feature@bookmarks              enabled               local
> > zroot  feature@filesystem_limits      enabled               local
> > zroot  feature@large_blocks           enabled               local
> > zroot  feature@sha512                 enabled               local
> > zroot  feature@skein                  enabled               local
> > zroot  feature@device_removal         enabled               local
> > zroot  feature@obsolete_counts        enabled               local
> > zroot  feature@zpool_checkpoint       enabled               local
> > #
> > 
> > and checked 'lsdev -v' at the loader prompt:
> > 
> > OK lsdev -v
> > PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,80)
> >     disk0: 4294967295 X 512 blocks
> >       disk0p1: EFI            200MB
> >       disk0p2: FreeBSD swap   128GB
> >       disk0p3: FreeBSD ZFS     19TB
> > net devices:
> > zfs devices:
> >   pool: zroot
> > bootfs: zroot/ROOT/default
> > config:
> > 
> >         NAME       STATE
> >         zroot      ONLINE
> >           mfid0p3  ONLINE
> > OK
> > 
> > but the disk size (4294967295 X 512) is still unchanged; or does
> > this mean 4294967295 X 512 X 512 bytes?
> > 
> >> 
> >> rgds,
> >> toomas
> > 
> > Regards
> > 
> > ---
> > KIRIYAMA Kazuhiko

---
KIRIYAMA Kazuhiko
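
P.S. For the BIOS-only case, Toomas's suggestion of keeping zroot
inside the first 2TB can be done with an ordinary GPT layout. A rough
sketch only; the device name mfid0, the labels, and the sizes are
illustrative:

# gpart create -s gpt mfid0
# gpart add -t freebsd-boot -s 512k mfid0
# gpart add -t freebsd-zfs -s 1T -l zroot0 mfid0    # boot pool, well inside 2TB
# gpart add -t freebsd-zfs -l data0 mfid0           # rest of the lun for a data pool
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 mfid0
# zpool create zroot mfid0p2
# zpool create data mfid0p3

Only zroot has to be readable through INT13; the data pool past the
2TB boundary is touched only by the kernel, which is not limited in
the same way.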