From: Toomas Soome
Subject: Re: ZFS: I/O error - blocks larger than 16777216 are not supported
Date: Tue, 26 Jun 2018 09:48:10 +0300
To: KIRIYAMA Kazuhiko
Cc: Allan Jude, freebsd-current@freebsd.org

> On 26 Jun 2018, at 05:08, KIRIYAMA Kazuhiko wrote:
>
> At Thu, 21 Jun 2018 10:48:28 +0300,
> Toomas Soome wrote:
>>
>>> On 21 Jun 2018, at 09:00, KIRIYAMA Kazuhiko wrote:
>>>
>>> At Wed, 20 Jun 2018 23:34:48 -0400,
>>> Allan Jude wrote:
>>>>
>>>> On 2018-06-20 21:36, KIRIYAMA Kazuhiko wrote:
>>>>> Hi all,
>>>>>
>>>>> I previously reported a problem booting from ZFS [1], and found
>>>>> that the issue comes from the RAID configuration [2]. So I
>>>>> rebuilt with RAID5 and re-installed 12.0-CURRENT
>>>>> (r333982).
>>>>> But it failed to boot with:
>>>>>
>>>>> ZFS: i/o error - all block copies unavailable
>>>>> ZFS: can't read MOS of pool zroot
>>>>> gptzfsboot: failed to mount default pool zroot
>>>>>
>>>>> FreeBSD/x86 boot
>>>>> ZFS: I/O error - blocks larger than 16777216 are not supported
>>>>> ZFS: can't find dataset u
>>>>> Default: zroot/<0x0>:
>>>>>
>>>>> In this case, the reason is "blocks larger than 16777216 are
>>>>> not supported", and I guess this means that datasets with a
>>>>> recordsize greater than 16MB are NOT supported by the
>>>>> FreeBSD boot loader (zpool-features(7)). Is that true?
>>>>>
>>>>> My zpool features are as follows:
>>>>>
>>>>> # kldload zfs
>>>>> # zpool import
>>>>>    pool: zroot
>>>>>      id: 13407092850382881815
>>>>>   state: ONLINE
>>>>>  status: The pool was last accessed by another system.
>>>>>  action: The pool can be imported using its name or numeric identifier and
>>>>>          the '-f' flag.
>>>>>     see: http://illumos.org/msg/ZFS-8000-EY
>>>>>  config:
>>>>>
>>>>>         zroot       ONLINE
>>>>>           mfid0p3   ONLINE
>>>>> # zpool import -fR /mnt zroot
>>>>> # zpool list
>>>>> NAME    SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
>>>>> zroot  19.9T   129G  19.7T         -     0%     0%  1.00x  ONLINE  /mnt
>>>>> # zpool get all zroot
>>>>> NAME   PROPERTY                                  VALUE                 SOURCE
>>>>> zroot  size                                      19.9T                 -
>>>>> zroot  capacity                                  0%                    -
>>>>> zroot  altroot                                   /mnt                  local
>>>>> zroot  health                                    ONLINE                -
>>>>> zroot  guid                                      13407092850382881815  default
>>>>> zroot  version                                   -                     default
>>>>> zroot  bootfs                                    zroot/ROOT/default    local
>>>>> zroot  delegation                                on                    default
>>>>> zroot  autoreplace                               off                   default
>>>>> zroot  cachefile                                 none                  local
>>>>> zroot  failmode                                  wait                  default
>>>>> zroot  listsnapshots                             off                   default
>>>>> zroot  autoexpand                                off                   default
>>>>> zroot  dedupditto                                0                     default
>>>>> zroot  dedupratio                                1.00x                 -
>>>>> zroot  free                                      19.7T                 -
>>>>> zroot  allocated                                 129G                  -
>>>>> zroot  readonly                                  off                   -
>>>>> zroot  comment                                   -                     default
>>>>> zroot  expandsize                                -                     -
>>>>> zroot  freeing                                   0                     default
>>>>> zroot  fragmentation                             0%                    -
>>>>> zroot  leaked                                    0                     default
>>>>> zroot  feature@async_destroy                     enabled               local
>>>>> zroot  feature@empty_bpobj                       active                local
>>>>> zroot  feature@lz4_compress                      active                local
>>>>> zroot  feature@multi_vdev_crash_dump             enabled               local
>>>>> zroot  feature@spacemap_histogram                active                local
>>>>> zroot  feature@enabled_txg                       active                local
>>>>> zroot  feature@hole_birth                        active                local
>>>>> zroot  feature@extensible_dataset                enabled               local
>>>>> zroot  feature@embedded_data                     active                local
>>>>> zroot  feature@bookmarks                         enabled               local
>>>>> zroot  feature@filesystem_limits                 enabled               local
>>>>> zroot  feature@large_blocks                      enabled               local
>>>>> zroot  feature@sha512                            enabled               local
>>>>> zroot  feature@skein                             enabled               local
>>>>> zroot  unsupported@com.delphix:device_removal    inactive              local
>>>>> zroot  unsupported@com.delphix:obsolete_counts   inactive              local
>>>>> zroot  unsupported@com.delphix:zpool_checkpoint  inactive              local
>>>>> #
>>>>>
>>>>> Regards
>>>>>
>>>>> [1] https://lists.freebsd.org/pipermail/freebsd-current/2018-March/068886.html
>>>>> [2] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=151910
>>>>>
>>>>> ---
>>>>> KIRIYAMA Kazuhiko
>>>>> _______________________________________________
>>>>> freebsd-current@freebsd.org mailing list
>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>>>>> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
>>>>>
>>>>
>>>> I am guessing it means something is corrupt, as 16MB is the maximum size
>>>> of a record in ZFS. Also, the 'large_blocks' feature is 'enabled', not
>>>> 'active', so this suggests you do not have any records larger than 128KB
>>>> on your pool.
>>>
>>> As I mentioned above, [2] says ZFS on RAID disks has
>>> serious bugs except for mirror. Anyway, I have given up on using ZFS
>>> on RAID{5,6}* until Bug 151910 [2] is fixed.
>>>
>>
>> If you boot from a USB stick (or CD), press Esc at the boot loader menu
>> and enter lsdev -v. What sector and disk sizes are reported?
>
> OK lsdev -v
> disk devices:
>     disk0:   BIOS drive C (31588352 X 512)
>       disk0p1: FreeBSD boot    512KB
>       disk0p2: FreeBSD UFS     13GB
>       disk0p3: FreeBSD swap    771MB
>     disk1:   BIOS drive D (4294967295 X 512)
>       disk0p1: FreeBSD boot    512KB
>       disk0p2: FreeBSD swap    128GB
>       disk0p3: FreeBSD ZFS     19TB
> OK
>
> Does this mean the whole disk size that I can use is
> 2TB (4294967295 X 512)?

Yes, or to be exact, that is the disk size reported by INT13; and since,
as shown below, you get the same value from UEFI, the limit seems to be
set by the RAID controller itself. In this case, the best way to address
the issue is to create one smaller LUN for the boot disk (zroot pool) and
a larger one for data. Or, of course, you can have a separate FreeBSD ZFS
partition for zroot; just make sure it fits inside the first 2TB.

There may also be an option for a RAID firmware update, or configuration
settings for the LUN, or you could use JBOD mode (if supported by the
card). JBOD would be best, because in the current setup the pool is
vulnerable to silent data corruption (checksum errors) and has no way to
recover (this is why RAID setups are not preferred with ZFS).

rgds,
toomas

>
>
>>
>> The issue [2] is a mix of ancient FreeBSD (v 8.1 is mentioned there)
>> and RAID LUNs with 512B sector size and 15TB!!! total size - are you
>> really sure your BIOS can actually address a 15TB LUN (with 512B sector
>> size)? Note that the problem with large disks can hide itself until the
>> pool has filled up enough that essential files are stored above
>> the limit, meaning that you may have a "perfectly working" setup until,
>> at some point in time, after the next update, it is suddenly not working
>> any more.
>>
>
> I see why I could use it for a while.
>
>> Note that for the boot loader we have only INT13h in the BIOS version,
>> and it really is limited. The UEFI version uses the EFI_BLOCK_IO API,
>> which usually can handle large sectors and disk sizes better.
>
> I re-installed the machine with UEFI boot:
>
> # gpart show mfid0
> =>          40  42965401520  mfid0  GPT  (20T)
>             40       409600      1  efi  (200M)
>         409640         2008         - free -  (1.0M)
>         411648    268435456      2  freebsd-swap  (128G)
>      268847104  42696552448      3  freebsd-zfs  (20T)
>    42965399552         2008         - free -  (1.0M)
>
> # uname -a
> FreeBSD vm.openedu.org 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r335317: Mon Jun 18 16:21:17 UTC 2018 root@releng3.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
> # zpool get all zroot
> NAME   PROPERTY                       VALUE                 SOURCE
> zroot  size                           19.9T                 -
> zroot  capacity                       0%                    -
> zroot  altroot                        -                     default
> zroot  health                         ONLINE                -
> zroot  guid                           11079446129259852576  default
> zroot  version                        -                     default
> zroot  bootfs                         zroot/ROOT/default    local
> zroot  delegation                     on                    default
> zroot  autoreplace                    off                   default
> zroot  cachefile                      -                     default
> zroot  failmode                       wait                  default
> zroot  listsnapshots                  off                   default
> zroot  autoexpand                     off                   default
> zroot  dedupditto                     0                     default
> zroot  dedupratio                     1.00x                 -
> zroot  free                           19.9T                 -
> zroot  allocated                      1.67G                 -
> zroot  readonly                       off                   -
> zroot  comment                        -                     default
> zroot  expandsize                     -                     -
> zroot  freeing                        0                     default
> zroot  fragmentation                  0%                    -
> zroot  leaked                         0                     default
> zroot  bootsize                       -                     default
> zroot  checkpoint                     -                     -
> zroot  feature@async_destroy          enabled               local
> zroot  feature@empty_bpobj            active                local
> zroot  feature@lz4_compress           active                local
> zroot  feature@multi_vdev_crash_dump  enabled               local
> zroot  feature@spacemap_histogram     active                local
> zroot  feature@enabled_txg            active                local
> zroot  feature@hole_birth             active                local
> zroot  feature@extensible_dataset     enabled               local
> zroot  feature@embedded_data          active                local
> zroot  feature@bookmarks              enabled               local
> zroot  feature@filesystem_limits      enabled               local
> zroot  feature@large_blocks           enabled               local
> zroot  feature@sha512                 enabled               local
> zroot  feature@skein                  enabled               local
> zroot  feature@device_removal         enabled               local
> zroot  feature@obsolete_counts        enabled               local
> zroot  feature@zpool_checkpoint       enabled               local
> #
>
> and checked 'lsdev -v' at the loader prompt:
>
> OK lsdev -v
> PciRoot(0x0)/Pci(0x1,0x0)/Pci(0x0,0x0)/VenHw(CF31FAC5-C24E-11D2-85F3-00A0C93EC93B,80)
>     disk0:    4294967295 X 512 blocks
>       disk0p1: EFI             200MB
>       disk0p2: FreeBSD swap    128GB
>       disk0p2: FreeBSD ZFS     19TB
> net devices:
> zfs devices:
>   pool: zroot
> bootfs: zroot/ROOT/default
> config:
>
>         NAME       STATE
>         zroot      ONLINE
>           mfid0p3  ONLINE
> OK
>
> but the disk size (4294967295 X 512) is still unchanged. Or does this
> mean 4294967295 X 512 X 512 bytes?
>
>>
>> rgds,
>> toomas
>>
>
> Regards
>
> ---
> KIRIYAMA Kazuhiko
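
On the units question above: lsdev prints the size as (sector count X sector size in bytes), so 4294967295 X 512 is 2199023255040 bytes, just under 2 TiB, and there is no second factor of 512. The sector count also equals 2^32 - 1, which is consistent with a 32-bit limit somewhere below the loader, although the thread does not establish exactly where. The sketch below is only an illustration of that arithmetic. The constants are copied from the lsdev and gpart output quoted in this message; the function names are made up for the example and do not correspond to any loader or ZFS API.

```python
"""Check the sizes from `lsdev -v` against the `gpart show mfid0` layout."""

SECTOR_SIZE = 512              # bytes per sector, as printed by lsdev
REPORTED_SECTORS = 4294967295  # sector count printed by lsdev (2**32 - 1)

# (type, start sector, size in sectors), copied from the gpart output above.
PARTITIONS = [
    ("efi", 40, 409600),
    ("freebsd-swap", 411648, 268435456),
    ("freebsd-zfs", 268847104, 42696552448),
]


def capacity_bytes(sectors, sector_size=SECTOR_SIZE):
    """Capacity in bytes of a device with the given sector count."""
    return sectors * sector_size


def fits_within(start, size, limit_sectors=REPORTED_SECTORS):
    """True if the partition ends at or before the last addressable sector."""
    return start + size <= limit_sectors


if __name__ == "__main__":
    total = capacity_bytes(REPORTED_SECTORS)
    print(f"reported capacity: {total} bytes ({total / 2**40:.2f} TiB)")
    for name, start, size in PARTITIONS:
        verdict = "fits" if fits_within(start, size) else "extends past the limit"
        print(f"{name}: sectors {start}..{start + size - 1} {verdict}")
```

With the layout above, the efi and swap partitions end below the reported limit, while the freebsd-zfs partition starts below it and ends far beyond it. That matches the earlier remark that such a setup can appear to work until essential files land above the limit.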