From: Kimmo Paasiala
To: Steven Hartland
Cc: "freebsd-stable@freebsd.org"
Subject: Re: ZFS on root booting broken somewhere after r270020
Date: Thu, 11 Sep 2014 03:18:37 +0300

> On 11.9.2014, at 3.04, Kimmo Paasiala wrote:
>
>
>> On 11.9.2014, at 2.41, Steven Hartland wrote:
>>
>>
>> ----- Original Message -----
>> From: "Steven Hartland"
>> To: "Kimmo Paasiala"
>> Cc:
>> Sent: Wednesday, September 10, 2014 11:36 PM
>> Subject: Re: ZFS on root booting broken somewhere after r270020
>>
>>
>>>
>>> ----- Original Message -----
>>> From: "Kimmo Paasiala"
>>> To: "Steven Hartland"
>>> Cc:
>>> Sent: Wednesday, September 10, 2014 8:26 PM
>>> Subject: Re: ZFS on root booting broken somewhere after r270020
>>>
>>>
>>>>
>>>>> On 9.9.2014, at 19.03, Kimmo Paasiala wrote:
>>>>>
>>>>>
>>>>>> On 9.9.2014, at 18.53, Steven Hartland wrote:
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: "Kimmo Paasiala"
>>>>>>> Hi it’s me again. Something that was committed in stable/10 after r271213 up to
>>>>>>> and including r271288 broke ZFS on Root booting in exactly the same way again.
>>>>>>> I know the problem is no longer related to extra kernel modules loaded in
>>>>>>> /boot/loader.conf because I’m loading only the required zfs.ko and opensolaris.ko
>>>>>>> modules. Also, the new vt(4) console that I’m using is not the culprit because the
>>>>>>> same thing happens with kern.vty set to “sc”.
>>>>>>
>>>>>> I've just updated my stable/10 box to r271316 and no problems booting from a ZFS root.
>>>>>>
>>>>>> So first things first what error are you seeing?
>>>>>>
>>>>>> Next what is your:
>>>>>> * Hardware
>>>>>> * Pool layout
>>>>>>
>>>>>> Regards
>>>>>> Steve
>>>>>
>>>>> The error is the same as before:
>>>>>
>>>>>   Mounting from zfs:rdnzltank/ROOT/default failed with error 5.
>>>>>
>>>>> Followed by the mountroot prompt and I get only these devices to choose from, no sign of the ZFS pool:
>>>>>
>>>>>   mountroot>
>>>>>   List of GEOM managed disk devices:
>>>>>   gpt/fb10disk1 gpt/fb10swap1 diskid/DISK-S13UJDWS301624p3 diskid/DISK-S13UJDWS301624p2 diskid/DISK-S13UJDWS301624p1 ada0p3 ada0p2 ada0p1 diskid/DISK-S13UJDWS301624 ada0
>>>>>
>>>>> Hardware is a Gigabyte GA-D510UD Mini-ITX motherboard:
>>>>>
>>>>> http://www.gigabyte.com/products/product-page.aspx?pid=3343#ov
>>>>>
>>>>> 4GBs of RAM. One 750GB Samsung HD753LJ 3.5” SATA HD on the Intel SATA controller.
>>>>>
>>>>> Pool layout:
>>>>>
>>>>>   pool: rdnzltank
>>>>>  state: ONLINE
>>>>>   scan: scrub repaired 0 in 1h7m with 0 errors on Wed Aug 20 09:27:48 2014
>>>>> config:
>>>>>
>>>>>         NAME             STATE     READ WRITE CKSUM
>>>>>         rdnzltank        ONLINE       0     0     0
>>>>>           gpt/fb10disk1  ONLINE       0     0     0
>>>>>
>>>>> errors: No known data errors
>>>>>
>>>>> Output of ‘gpart show’:
>>>>>
>>>>> freebsd10 ~ % gpart show
>>>>> =>        34  1465146988  ada0  GPT  (699G)
>>>>>           34        2014        - free -  (1.0M)
>>>>>         2048        1024     1  freebsd-boot  (512K)
>>>>>         3072        1024        - free -  (512K)
>>>>>         4096    16777216     2  freebsd-swap  (8.0G)
>>>>>     16781312  1448365710     3  freebsd-zfs  (691G)
>>>>>
>>>>>
>>>>> HTH,
>>>>>
>>>>> -Kimmo
>>>>
>>>>
>>>> More information.
>>>> This version still works:
>>>>
>>>> FreeBSD freebsd10.rdnzl.info 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271237: Wed Sep 10 11:00:15 EEST 2014 root@buildstable10amd64.rdnzl.info:/usr/obj/usr/src/sys/GENERIC amd64
>>>>
>>>> The next higher version r271238 breaks booting for me. The commit in question is this one:
>>>>
>>>> http://svnweb.freebsd.org/base?view=revision&sortby=rev&sortdir=down&revision=271238
>>>
>>> Investigating, had no reports of issues while this has been in head.
>>
>> I've just installed a stable/10 kernel, specifically:
>> 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #11 r271316M
>>
>> and booted fine from a mirrored root without issue:
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         tank        ONLINE       0     0     0
>>           mirror-0  ONLINE       0     0     0
>>             ada0p3  ONLINE       0     0     0
>>             ada2p3  ONLINE       0     0     0
>>
>> gpart show ada0 ada2
>> =>       34  250069613  ada0  GPT  (119G)
>>          34        128     1  freebsd-boot  (64K)
>>         162    8388608     2  freebsd-swap  (4.0G)
>>     8388770  241680877     3  freebsd-zfs  (115G)
>>
>> =>       40  586072288  ada2  GPT  (279G)
>>          40        128     1  freebsd-boot  (64K)
>>         168    8388608     2  freebsd-swap  (4.0G)
>>     8388776  577683552     3  freebsd-zfs  (275G)
>>
>> I then detached the second disk so the machine had just:
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         tank        ONLINE       0     0     0
>>           ada0p3    ONLINE       0     0     0
>>
>> Rebooted and again all fine no issues
>>
>> I've also got a raidz1 box on the same kernel it too is fine.
>>
>> =>       34  500118125  ada0  GPT  (238G)
>>          34        128     1  freebsd-boot  (64K)
>>         162  500117997     2  freebsd-zfs  (238G)
>> ...
>>
>> So it seems like there's something odd about your environment, especially
>> given you've had a similar issue before.
>>
>> So the questions:
>> 1. What does zpool get all report?
>> 2. What does /boot/loader.conf have in it?
>> 3. What does zdb -C rdnzltank report?
>> 4. What does /etc/rc.conf have in it?
>>
>> Regards
>> Steve
>
> Here goes:
>
> freebsd10 ~ % zpool get all rdnzltank
> NAME       PROPERTY                       VALUE                   SOURCE
> rdnzltank  size                           688G                    -
> rdnzltank  capacity                       9%                      -
> rdnzltank  altroot                        -                       default
> rdnzltank  health                         ONLINE                  -
> rdnzltank  guid                           5382786142589818227     default
> rdnzltank  version                        -                       default
> rdnzltank  bootfs                         rdnzltank/ROOT/default  local
> rdnzltank  delegation                     on                      default
> rdnzltank  autoreplace                    off                     default
> rdnzltank  cachefile                      -                       default
> rdnzltank  failmode                       wait                    default
> rdnzltank  listsnapshots                  off                     default
> rdnzltank  autoexpand                     off                     default
> rdnzltank  dedupditto                     0                       default
> rdnzltank  dedupratio                     1.00x                   -
> rdnzltank  free                           622G                    -
> rdnzltank  allocated                      66.2G                   -
> rdnzltank  readonly                       off                     -
> rdnzltank  comment                        -                       default
> rdnzltank  expandsize                     0                       -
> rdnzltank  freeing                        0                       default
> rdnzltank  fragmentation                  20%                     -
> rdnzltank  leaked                         0                       default
> rdnzltank  feature@async_destroy          enabled                 local
> rdnzltank  feature@empty_bpobj            active                  local
> rdnzltank  feature@lz4_compress           active                  local
> rdnzltank  feature@multi_vdev_crash_dump  enabled                 local
> rdnzltank  feature@spacemap_histogram     active                  local
> rdnzltank  feature@enabled_txg            active                  local
> rdnzltank  feature@hole_birth             active                  local
> rdnzltank  feature@extensible_dataset     enabled                 local
> rdnzltank  feature@embedded_data          active                  local
> rdnzltank  feature@bookmarks              enabled                 local
> rdnzltank  feature@filesystem_limits      enabled                 local
>
> freebsd10 ~ % cat /boot/loader.conf
>
> kern.geom.label.gptid.enable=0
> hw.usb.no_pf=1
> kern.cam.ada.legacy_aliases=0
> zfs_load="YES"
> vfs.zfs.prefetch_disable=0
> kern.vty=vt
>
> I have already tried without the gptid and legacy_aliases options, no
> difference. The prefetch_disable was at the default setting 1 when the
> problem appeared. The hw.usb.no_pf setting shouldn’t have an effect but
> I can test it once I can reboot the machine again.
> I’m attaching a second disk at the moment to make a mirror of the pool.
> The kern.vty setting didn’t make a difference.
>
> The next is now with the second disk being resilvered, gpt/fb10disk2 is the new disk:
>
> MOS Configuration:
>         version: 5000
>         name: 'rdnzltank'
>         state: 0
>         txg: 1634460
>         pool_guid: 5382786142589818227
>         hostid: 852094392
>         hostname: 'freebsd10.rdnzl.info'
>         vdev_children: 1
>         vdev_tree:
>             type: 'root'
>             id: 0
>             guid: 5382786142589818227
>             children[0]:
>                 type: 'mirror'
>                 id: 0
>                 guid: 6268049119730836293
>                 whole_disk: 0
>                 metaslab_array: 34
>                 metaslab_shift: 32
>                 ashift: 9
>                 asize: 741558452224
>                 is_log: 0
>                 create_txg: 4
>                 children[0]:
>                     type: 'disk'
>                     id: 0
>                     guid: 1732695434302750511
>                     path: '/dev/gpt/fb10disk1'
>                     phys_path: '/dev/gpt/fb10disk1'
>                     whole_disk: 1
>                     DTL: 98
>                     create_txg: 4
>                 children[1]:
>                     type: 'disk'
>                     id: 1
>                     guid: 15812067837864729710
>                     path: '/dev/gpt/fb10disk2'
>                     phys_path: '/dev/gpt/fb10disk2'
>                     whole_disk: 1
>                     DTL: 526
>                     create_txg: 4
>                     resilver_txg: 1634424
>         features_for_read:
>             com.delphix:hole_birth
>             com.delphix:embedded_data
>
> I don’t think I have anything in /etc/rc.conf that would have an effect at the time when the kernel tries to mount the root filesystem but here it is:
>
> hostname="freebsd10.rdnzl.info"
> keymap="fi.kbd"
>
> #cloned_interfaces="lo1"
> #ifconfig_vtnet0="SYNCDHCP"
> ifconfig_re0="inet 10.71.14.12/24"
> #ifconfig_re0_alias0="inet 10.71.14.112/24"
> defaultrouter="10.71.14.1"
> #gateway_enable="YES"
>
> ipv6_activate_all_interfaces="YES"
> #ifconfig_vtnet0_ipv6="accept_rtadv"
> ifconfig_re0_ipv6="inet6 2001:14b8:100:ZZZZ::XXXX/64"
> ipv6_defaultrouter="2001:14b8:100:ZZZZ::1"
> #ipv6_gateway_enable="YES"
>
> #pf_enable="YES"
> #pflog_enable="YES"
> #pflog_flags="-d 10 -s 256"
>
> zfs_enable="YES"
>
> #devfs_load_rulesets=YES
>
> sshd_enable="YES"
> # Set dumpdev to "AUTO" to enable crash dumps, "NO" to disable
> dumpdev="AUTO"
>
> clear_tmp_enable="YES"
>
> sendmail_enable="NO"
> sendmail_submit_enable="NO"
> sendmail_outbound_enable="NO"
> sendmail_msp_queue_enable="NO"
>
> rpcbind_enable="YES"
> nfs_server_enable="YES"
> mountd_enable="YES"
>
> #nfsv4_server_enable="YES"
> #nfsuserd_enable="YES"
> #mountd_flags="-r"
>
> ntpd_enable="YES"
> ntpd_sync_on_start="YES"
>
> jail_enable="YES"
> jail_list="buildstable10amd64 buildreleng100i386"
>
> #ntpdate_enable="YES"
> #ntpdate_hosts="10.71.14.1"
>
> nginx_enable="YES"
>
>
> #mdnsresponderposix_enable="YES"
> mdnsresponderposix_flags="-f /usr/local/etc/mDNSResponder.conf"
>
>
> #openntpd_enable="YES"
>
> #avahi_daemon_enable="YES"
> #dbus_enable="YES"
> mdnsd_enable="YES"
>
> smartd_enable="YES"
>
> dma_flushq_enable="YES"
>
> -Kimmo
>

Just a thought. Is my problem related to the use of GPT-labeled partitions in my pool configuration? Your testing shows just "raw" devices like ada0p3 etc.

-Kimmo