From: Kimmo Paasiala
To: Steven Hartland
Cc: "freebsd-stable@freebsd.org"
Subject: Re: ZFS on root booting broken somewhere after r270020
Date: Thu, 11 Sep 2014 03:55:23 +0300
List-Id: Production branch of FreeBSD source code

> On 11.9.2014, at 3.52, Steven Hartland wrote:
>
> ----- Original Message ----- From: "Kimmo Paasiala"
> To: "Steven Hartland"
> Cc:
> Sent: Thursday, September 11, 2014 1:04 AM
> Subject: Re: ZFS on root booting broken somewhere after r270020
>
>> On 11.9.2014, at 2.41, Steven Hartland wrote:
>>
>> ----- Original Message ----- From: "Steven Hartland"
>> To: "Kimmo Paasiala"
>> Cc:
>> Sent: Wednesday, September 10, 2014 11:36 PM
>> Subject: Re: ZFS on root booting broken somewhere after r270020
>>
>>> ----- Original Message ----- From: "Kimmo Paasiala"
>>> To: "Steven Hartland"
>>> Cc:
>>> Sent: Wednesday, September 10, 2014 8:26 PM
>>> Subject: Re: ZFS on root booting broken somewhere after r270020
>>>
>>>>> On 9.9.2014, at 19.03, Kimmo Paasiala wrote:
>>>>>
>>>>>> On 9.9.2014, at 18.53, Steven Hartland wrote:
>>>>>>
>>>>>> ----- Original Message ----- From: "Kimmo Paasiala"
>>>>>>> Hi it’s me again. Something that was committed in stable/10 after r271213 up to
>>>>>>> and including r271288 broke ZFS on Root booting in exactly the same way again.
>>>>>>> I know the problem is no longer related to extra kernel modules loaded in
>>>>>>> /boot/loader.conf because I’m loading only the required zfs.ko and opensolaris.ko
>>>>>>> modules.
>>>>>>> Also, the new vt(4) console that I’m using is not the culprit because the
>>>>>>> same thing happens with kern.vty set to “sc”.
>>>>>>
>>>>>> I've just updated my stable/10 box to r271316 and have no problems booting from a ZFS root.
>>>>>>
>>>>>> So first things first, what error are you seeing?
>>>>>>
>>>>>> Next, what is your:
>>>>>> * Hardware
>>>>>> * Pool layout
>>>>>>
>>>>>> Regards
>>>>>> Steve
>>>>>
>>>>> The error is the same as before:
>>>>>
>>>>> • Mounting from zfs:rdnzltank/ROOT/default failed with error 5.
>>>>>
>>>>> Followed by the mountroot prompt and I get only these devices to choose from, no sign of the ZFS pool:
>>>>>
>>>>> • mountroot>
>>>>> • List of GEOM managed disk devices:
>>>>> • gpt/fb10disk1 gpt/fb10swap1 diskid/DISK-S13UJDWS301624p3 diskid/DISK-S13UJDWS301624p2 diskid/DISK-S13UJDWS301624p1 ada0p3 ada0p2 ada0p1 diskid/DISK-S13UJDWS301624 ada0
>>>>>
>>>>> Hardware is a Gigabyte GA-D510UD Mini-ITX motherboard:
>>>>>
>>>>> http://www.gigabyte.com/products/product-page.aspx?pid=3343#ov
>>>>>
>>>>> 4 GB of RAM. One 750GB Samsung HD753LJ 3.5” SATA HD on the Intel SATA controller.
>>>>>
>>>>> Pool layout:
>>>>>
>>>>>   pool: rdnzltank
>>>>>  state: ONLINE
>>>>>   scan: scrub repaired 0 in 1h7m with 0 errors on Wed Aug 20 09:27:48 2014
>>>>> config:
>>>>>
>>>>>     NAME             STATE     READ WRITE CKSUM
>>>>>     rdnzltank        ONLINE       0     0     0
>>>>>       gpt/fb10disk1  ONLINE       0     0     0
>>>>>
>>>>> errors: No known data errors
>>>>>
>>>>> Output of ‘gpart show’:
>>>>>
>>>>> freebsd10 ~ % gpart show
>>>>> =>        34  1465146988  ada0  GPT  (699G)
>>>>>           34        2014        - free -  (1.0M)
>>>>>         2048        1024     1  freebsd-boot  (512K)
>>>>>         3072        1024        - free -  (512K)
>>>>>         4096    16777216     2  freebsd-swap  (8.0G)
>>>>>     16781312  1448365710     3  freebsd-zfs  (691G)
>>>>>
>>>>> HTH,
>>>>>
>>>>> -Kimmo
>>>>
>>>> More information.
>>>> This version still works:
>>>>
>>>> FreeBSD freebsd10.rdnzl.info 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #0 r271237: Wed Sep 10 11:00:15 EEST 2014 root@buildstable10amd64.rdnzl.info:/usr/obj/usr/src/sys/GENERIC amd64
>>>>
>>>> The next higher version r271238 breaks booting for me. The commit in question is this one:
>>>>
>>>> http://svnweb.freebsd.org/base?view=revision&sortby=rev&sortdir=down&revision=271238
>>>
>>> Investigating, had no reports of issues while this has been in head.
>>
>> I've just installed a stable/10 kernel, specifically:
>> 10.1-PRERELEASE FreeBSD 10.1-PRERELEASE #11 r271316M
>>
>> and booted fine from a mirrored root without issue:
>> config:
>>
>>     NAME        STATE     READ WRITE CKSUM
>>     tank        ONLINE       0     0     0
>>       mirror-0  ONLINE       0     0     0
>>         ada0p3  ONLINE       0     0     0
>>         ada2p3  ONLINE       0     0     0
>>
>> gpart show ada0 ada2
>> =>       34  250069613  ada0  GPT  (119G)
>>          34        128     1  freebsd-boot  (64K)
>>         162    8388608     2  freebsd-swap  (4.0G)
>>     8388770  241680877     3  freebsd-zfs  (115G)
>>
>> =>       40  586072288  ada2  GPT  (279G)
>>          40        128     1  freebsd-boot  (64K)
>>         168    8388608     2  freebsd-swap  (4.0G)
>>     8388776  577683552     3  freebsd-zfs  (275G)
>>
>> I then detached the second disk so the machine had just:
>> config:
>>
>>     NAME      STATE     READ WRITE CKSUM
>>     tank      ONLINE       0     0     0
>>       ada0p3  ONLINE       0     0     0
>>
>> Rebooted and again all fine, no issues.
>>
>> I've also got a raidz1 box on the same kernel; it too is fine.
>>
>> =>   34  500118125  ada0  GPT  (238G)
>>      34        128     1  freebsd-boot  (64K)
>>     162  500117997     2  freebsd-zfs  (238G)
>> ...
>>
>> So it seems like there's something odd about your environment, especially
>> given you've had a similar issue before.
>>
>> So the questions:
>> 1. What does zpool get all report?
>> 2. What does /boot/loader.conf have in it?
>> 3. What does zdb -C rdnzltank report?
>> 4. What does /etc/rc.conf have in it?
>>
>> Regards
>> Steve
>
> Here goes:
> snip...
>
> The next is now with the second disk being resilvered, gpt/fb10disk2 is the new disk:
>
> MOS Configuration:
>     version: 5000
>     name: 'rdnzltank'
>     state: 0
>     txg: 1634460
>     pool_guid: 5382786142589818227
>     hostid: 852094392
>     hostname: 'freebsd10.rdnzl.info'
>     vdev_children: 1
>     vdev_tree:
>         type: 'root'
>         id: 0
>         guid: 5382786142589818227
>         children[0]:
>             type: 'mirror'
>             id: 0
>             guid: 6268049119730836293
>             whole_disk: 0
>             metaslab_array: 34
>             metaslab_shift: 32
>             ashift: 9
>             asize: 741558452224
>             is_log: 0
>             create_txg: 4
>             children[0]:
>                 type: 'disk'
>                 id: 0
>                 guid: 1732695434302750511
>                 path: '/dev/gpt/fb10disk1'
>                 phys_path: '/dev/gpt/fb10disk1'
>                 whole_disk: 1
>                 DTL: 98
>                 create_txg: 4
>             children[1]:
>                 type: 'disk'
>                 id: 1
>                 guid: 15812067837864729710
>                 path: '/dev/gpt/fb10disk2'
>                 phys_path: '/dev/gpt/fb10disk2'
>                 whole_disk: 1
>                 DTL: 526
>                 create_txg: 4
>                 resilver_txg: 1634424
>     features_for_read:
>         com.delphix:hole_birth
>         com.delphix:embedded_data
>
> Ok this could show your problem ^^
>
> In a previous post you said
>>>>>   pool: rdnzltank
>>>>>  state: ONLINE
>>>>>   scan: scrub repaired 0 in 1h7m with 0 errors on Wed Aug 20 09:27:48 2014
>>>>> config:
>>>>>
>>>>>     NAME             STATE     READ WRITE CKSUM
>>>>>     rdnzltank        ONLINE       0     0     0
>>>>>       gpt/fb10disk1  ONLINE       0     0     0
>
> But zdb thinks your pool is a mirror, which I believe indicates that your pool's real
> config is out of sync with the cache file.
>
> Now this shouldn't cause an issue, as it should just try all devices in order until it
> succeeds, but there may be an issue there somewhere.
>
> Could you:
> 1. back up your cache file
>    cp /boot/zfs/zpool.cache /boot/zfs/zpool.cache.old
> 2. regenerate your cache file
>    zpool set cachefile=/boot/zfs/zpool.cache rdnzltank
> 3. rerun the zdb command and let us know the output
>    zdb -C rdnzltank
>    I'm hoping that it should show:
>    ...
>    vdev_tree:
>        type: 'root'
>        id: 0
>        guid: 5382786142589818227
>        children[0]:
>            type: 'disk'
>    ..
> 4. If it does show type 'disk', try rebooting with the new kernel.
>
> Regards
> Steve

Yes, as I said, I’m right now trying to see if the pool would work as a mirror with the newer kernel that somehow broke booting for me. I’m in the middle of a resilver that keeps restarting for some odd reason every 15 minutes…

-Kimmo
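
Step 3 in the quoted list comes down to reading one field from the zdb -C output: the type of the first child under vdev_tree. A minimal Python sketch of that check follows, assuming the output has been saved as text. The function name top_level_vdev_type and the sample string are hypothetical and not part of any ZFS tool; the parser only looks at the first children[0] entry and ignores everything else.

```python
import re
from typing import Optional


def top_level_vdev_type(zdb_output: str) -> Optional[str]:
    """Return the type of the first top-level vdev in `zdb -C` text, or None.

    Hypothetical helper: it scans for `vdev_tree:`, then the first
    `children[0]:` line, then the next `type: '...'` line.
    """
    seen_tree = False
    seen_child = False
    for raw in zdb_output.splitlines():
        # Drop mail quote markers and indentation before matching.
        line = raw.lstrip("> \t").strip()
        if line.startswith("vdev_tree:"):
            seen_tree = True
        elif seen_tree and line.startswith("children[0]:"):
            seen_child = True
        elif seen_child:
            match = re.match(r"type:\s*'([^']+)'", line)
            if match:
                return match.group(1)
    return None


# Abbreviated sample modelled on the configuration quoted in this message.
SAMPLE = """\
vdev_tree:
    type: 'root'
    id: 0
    children[0]:
        type: 'mirror'
        children[0]:
            type: 'disk'
"""

if __name__ == "__main__":
    print(top_level_vdev_type(SAMPLE))
```

For the configuration quoted in this message the helper reports 'mirror'; the value Steve is hoping for after the cache file is regenerated is 'disk'.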