Date: Sun, 3 Nov 2013 08:57:09 -0800
From: Artem Belevich <art@freebsd.org>
To: Andriy Gapon <avg@freebsd.org>
Cc: "stable@freebsd.org" <stable@freebsd.org>, fs@freebsd.org
Subject: Re: Can't mount root from raidz2 after r255763 in stable/9
Message-ID: <CAFqOu6hASSZJ5F-a0Svrz_5%2BG0143o6FzUJi5B_RsY9PNJ2PDw@mail.gmail.com>
In-Reply-To: <5276030E.5040100@FreeBSD.org>
References: <CAFqOu6jfZc5bGF4n0tLa%2BY7=UkqmbsK589o6G%2BUiP3OTdyLdTg__13033.8046853014$1383448959$gmane$org@mail.gmail.com> <5276030E.5040100@FreeBSD.org>
TL;DR version -- solved. The failure was caused by zombie ZFS volume labels left over from the disks' previous life in another pool. For some reason the kernel now picks up the labels from the raw devices first and tries to boot from a pool that no longer exists. Nuking the old labels with dd fixed booting.

On Sun, Nov 3, 2013 at 1:02 AM, Andriy Gapon <avg@freebsd.org> wrote:
> on 03/11/2013 05:22 Artem Belevich said the following:
>> Hi,
>>
>> I have a box with root mounted from an 8-disk raidz2 ZFS volume.
>> After a recent buildworld I've run into an issue where the kernel fails to
>> mount root with error 6.
>> r255763 on stable/9 is the first revision that fails to mount root on
>> my box. The preceding r255749 boots fine.
>>
>> Commit r255763 (http://svnweb.freebsd.org/base?view=revision&revision=255763)
>> MFCs a bunch of changes from 10, but I don't see anything that obviously
>> impacts ZFS.
>
> Indeed.
>
>> Attempting to boot with vfs.zfs.debug=1 shows that the order in which geom
>> providers are probed by ZFS has apparently changed. Kernels that boot
>> show "guid match for provider /dev/gpt/<valid pool slice>", while
>> failing kernels show "guid match for provider /dev/daX" -- the raw
>> disks, which are *not* the right geom providers for my pool slices. Beats
>> me why ZFS picks the raw disks over the GPT partitions it should have used.
>
> Perhaps the kernel gpart code fails to recognize the partitions and thus ZFS
> can't see them?
>
>> Pool configuration:
>> # zpool status z0
>>   pool: z0
>>  state: ONLINE
>>   scan: scrub repaired 0 in 8h57m with 0 errors on Sat Oct 19 20:23:52 2013
>> config:
>>
>>         NAME                   STATE     READ WRITE CKSUM
>>         z0                     ONLINE       0     0     0
>>           raidz2-0             ONLINE       0     0     0
>>             gpt/da0p4-z0       ONLINE       0     0     0
>>             gpt/da1p4-z0       ONLINE       0     0     0
>>             gpt/da2p4-z0       ONLINE       0     0     0
>>             gpt/da3p4-z0       ONLINE       0     0     0
>>             gpt/da4p4-z0       ONLINE       0     0     0
>>             gpt/da5p4-z0       ONLINE       0     0     0
>>             gpt/da6p4-z0       ONLINE       0     0     0
>>             gpt/da7p4-z0       ONLINE       0     0     0
>>         logs
>>           mirror-1             ONLINE       0     0     0
>>             gpt/ssd-zil-z0     ONLINE       0     0     0
>>             gpt/ssd1-zil-z0    ONLINE       0     0     0
>>         cache
>>           gpt/ssd1-l2arc-z0    ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> Here are screen captures from a failed boot:
>> https://plus.google.com/photos/+ArtemBelevich/albums/5941857781891332785
>
> I don't have permission to view this album.

Argh. Copy-paste error. Try these:
https://plus.google.com/photos/101142993171487001774/albums/5941857781891332785?authkey=CPm-4YnarsXhKg
https://plus.google.com/photos/+ArtemBelevich/albums/5941857781891332785?authkey=CPm-4YnarsXhKg

>
>> And here's the boot log from a successful boot on the same system:
>> http://pastebin.com/XCwebsh7
>>
>> Removing ZIL and L2ARC makes no difference -- r255763 still fails to mount root.
>>
>> I'm thoroughly baffled. Is there something wrong with the pool --
>> some junk metadata somewhere on the disk that now screws with the root
>> mounting? A changed order in geom provider enumeration? Something else?
>> Any suggestions on what I can do to debug this further?
>
> gpart.

Long version of the story: it was stale metadata after all. 'zdb -l /dev/daN' showed that one of the four pool labels was still present on every drive in the pool. Long ago the drives were temporarily used as raw devices in a ZFS pool on a test box. I then destroyed that pool, sliced the drives into partitions with GPT and used one of the partitions to build the current pool. Apparently not all of the old pool's labels were overwritten by the new pool, but that went unnoticed until now because, by accident of enumeration order, the new pool was detected first.
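To make this concrete, the check looks roughly like this (da0 stands in for any of the eight drives; setting vfs.zfs.debug="1" in /boot/loader.conf, or at the loader prompt, is what produces the "guid match for provider" lines during boot):

# zdb -l /dev/gpt/da0p4-z0
  (labels of the partition the current pool actually uses -- all four
   should unpack fine here)
# zdb -l /dev/da0
  (labels as seen through the raw disk -- on my drives one of the four
   slots still held the guid of the long-destroyed pool)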
Now the detection order has changed (I'm still not sure how or why), and that resurrected the old, non-existent pool and caused the boot failures. After finding the location of the stale volume labels on the disks and nuking them with dd, the boot issues went away.

The scary part was that the label was *inside* the current pool slice, so I had to corrupt current pool data. I figured that, since the old label was still intact, the current pool hadn't written anything there yet, and it should therefore be safe to overwrite it. I did it on one drive only at first, so that if I was wrong ZFS would still have been able to rebuild the pool. Luckily for me, no vital data was hurt in the process and zfs scrub reported zero errors. After nuking the old labels on the other drives, the boot issues were gone.

Even though my problem has been dealt with, I still wonder whether pool detection should be more robust. I've been lucky that it was the kernel that changed pool detection and not the bootloader -- that would have made troubleshooting even more interesting. Would it make sense to prefer partitions over whole drives? Or, perhaps, to prefer pools with all labels intact over devices that have only a small fraction of valid labels?

--Artem
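P.S. For anyone who runs into the same thing, here is a rough sketch of the label arithmetic -- the device name and the MEDIASIZE placeholder are illustrative, not literally what I typed, so verify against 'zdb -l' output before writing anything. ZFS keeps four 256 KiB label copies per vdev: two at the front (offsets 0 and 256 KiB) and two at the tail, aligned down to a 256 KiB boundary from the end of the device. For a whole-disk vdev the tail copies land near the end of the raw disk, which is why the survivor ended up inside the daNp4 slice on my layout. Something along these lines zeroes the two stale tail copies:

# diskinfo -v da0
  (take the "mediasize in bytes" value -- call it MEDIASIZE below)
# sysctl kern.geom.debugflags=0x10
  (may be needed before GEOM lets you write to a disk that is in use)
# dd if=/dev/zero of=/dev/da0 bs=256k count=2 seek=$(( MEDIASIZE / 262144 - 2 ))
  (the seek expression rounds the disk size down to a 256 KiB multiple and
   backs off two label slots, so the write starts roughly 512 KiB from the
   end of the disk)

This overwrites data inside the live partition, so do it one drive at a time and scrub in between, exactly as described above.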