From: Artem Belevich <artemb@gmail.com>
Date: Sun, 3 Nov 2013 08:57:09 -0800
Subject: Re: Can't mount root from raidz2 after r255763 in stable/9
To: Andriy Gapon
Cc: "stable@freebsd.org", fs@freebsd.org

TL;DR version: solved. The failure was caused by zombie ZFS volume labels
left over from the disks' previous life in another pool. For some reason the
kernel now picks up the labels from the raw devices first and tries to mount
root from a pool that no longer exists. Nuking the old labels with dd solved
my booting issues.

On Sun, Nov 3, 2013 at 1:02 AM, Andriy Gapon wrote:
> on 03/11/2013 05:22 Artem Belevich said the following:
>> Hi,
>>
>> I have a box with root mounted from an 8-disk raidz2 ZFS volume.
>> After a recent buildworld I've run into an issue where the kernel fails
>> to mount root with error 6.
>> r255763 on stable/9 is the first revision that fails to mount root on
>> my box. The preceding r255749 boots fine.
>>
>> Commit r255763 (http://svnweb.freebsd.org/base?view=revision&revision=255763)
>> MFCs a bunch of changes from 10, but I don't see anything that obviously
>> impacts ZFS.
>
> Indeed.
>
>> Attempting to boot with vfs.zfs.debug=1 shows that the order in which
>> geom providers are probed by ZFS has apparently changed. Kernels that
>> boot show "guid match for provider /dev/gpt/" while failing kernels
>> show "guid match for provider /dev/daX" -- the raw disks that are *not*
>> the right geom providers for my pool slices. Beats me why ZFS picks the
>> raw disks over the GPT partitions it should have.
>
> Perhaps the kernel gpart code fails to recognize the partitions and thus ZFS
> can't see them?
>
>> Pool configuration:
>> # zpool status z0
>>   pool: z0
>>  state: ONLINE
>>   scan: scrub repaired 0 in 8h57m with 0 errors on Sat Oct 19 20:23:52 2013
>> config:
>>
>>         NAME                   STATE     READ WRITE CKSUM
>>         z0                     ONLINE       0     0     0
>>           raidz2-0             ONLINE       0     0     0
>>             gpt/da0p4-z0       ONLINE       0     0     0
>>             gpt/da1p4-z0       ONLINE       0     0     0
>>             gpt/da2p4-z0       ONLINE       0     0     0
>>             gpt/da3p4-z0       ONLINE       0     0     0
>>             gpt/da4p4-z0       ONLINE       0     0     0
>>             gpt/da5p4-z0       ONLINE       0     0     0
>>             gpt/da6p4-z0       ONLINE       0     0     0
>>             gpt/da7p4-z0       ONLINE       0     0     0
>>         logs
>>           mirror-1             ONLINE       0     0     0
>>             gpt/ssd-zil-z0     ONLINE       0     0     0
>>             gpt/ssd1-zil-z0    ONLINE       0     0     0
>>         cache
>>           gpt/ssd1-l2arc-z0    ONLINE       0     0     0
>>
>> errors: No known data errors
>>
>> Here are screen captures from a failed boot:
>> https://plus.google.com/photos/+ArtemBelevich/albums/5941857781891332785
>
> I don't have permission to view this album.

Argh. Copy-paste error. Try these:
https://plus.google.com/photos/101142993171487001774/albums/5941857781891332785?authkey=CPm-4YnarsXhKg
https://plus.google.com/photos/+ArtemBelevich/albums/5941857781891332785?authkey=CPm-4YnarsXhKg

>
>> And here's the boot log from a successful boot on the same system:
>> http://pastebin.com/XCwebsh7
>>
>> Removing the ZIL and L2ARC makes no difference -- r255763 still fails to
>> mount root.
>>
>> I'm thoroughly baffled. Is there something wrong with the pool -- some
>> junk metadata somewhere on the disk that now screws with the root
>> mounting? A changed order in geom provider enumeration? Something else?
>> Any suggestions on what I can do to debug this further?
>
> gpart.

Long version of the story:

It was stale metadata after all. 'zdb -l /dev/daN' showed that one of the
four pool labels was still present on every drive in the pool. Long ago the
drives were temporarily used as raw drives in a ZFS pool on a test box. Then
I destroyed that pool, sliced the drives into partitions with GPT, and used
one of the partitions to build the current pool. Apparently not all of the
old pool labels were overwritten by the new pool, but that went unnoticed
until now because the new pool happened to be detected first. Now the
detection order has changed (I'm still not sure how or why), which
resurrected the old, non-existent pool and caused the boot failures.

After finding the location of the stale labels on disk and nuking them with
dd, the boot issues went away (rough commands are at the end of this mail).
The scary part was that the surviving label sat *inside* the current pool's
slice, so I had to overwrite data within the current pool. I figured that
since the old label was still intact, the current pool had not written
anything there yet, and therefore it should be safe to overwrite it. I did it
on one drive only at first; had I been wrong, ZFS should have been able to
rebuild the pool. Luckily no vital data was hurt in the process and zfs scrub
reported zero errors. After nuking the old labels on the other drives, the
boot issues were gone for good.

Even though my problem has been dealt with, I still wonder whether pool
detection should be more robust. I've been lucky that it was the kernel that
changed its pool detection and not the bootloader -- that would have made
troubleshooting even more interesting. Would it make sense to prefer
partitions over whole drives? Or perhaps to prefer pools with all labels
intact over devices that have only a small fraction of valid labels?

--Artem
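
P.S. For anyone who wants to check their own disks for the same kind of
leftover, this is roughly the check I did with zdb. The device names are from
my pool; substitute your own raw disks and partitions. A zombie shows up as a
label on the raw device whose pool name/guid does not match the pool that
actually lives on the partition:

    # Labels that the current pool wrote to its slice -- this is what
    # *should* be found.
    zdb -l /dev/gpt/da0p4-z0

    # Labels visible on the raw disk.  If this prints a label for a pool
    # that no longer exists (old name, old pool_guid), that's the stale one.
    zdb -l /dev/da0

zdb -l prints up to four labels per device; in my case only one of the four
old labels had survived on each drive, and that was enough for the kernel to
"find" the dead pool.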
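
And this is, roughly, the shape of the dd surgery. Treat it as a sketch
rather than a recipe: the device name, the awk field, and the assumption that
the survivor is one of the two trailing label slots are specific to my
layout, and dd here writes straight over a live disk. Double-check the zdb
and "gpart show" output before running it, and do one drive at a time so that
raidz2 can rebuild if something goes wrong:

    # ZFS keeps four 256 KiB labels per vdev: two at the front and two at
    # the end.  For the old whole-disk pool the trailing pair sits in the
    # last 512 KiB of the raw disk (aligned down to a 256 KiB boundary),
    # which is the region that now falls inside my p4 slice.
    mediasize=$(diskinfo /dev/da0 | awk '{print $3}')  # disk size in bytes
    aligned=$(( mediasize - mediasize % 262144 ))      # align to 256 KiB

    # Byte offsets of the two trailing label slots of the old pool.
    l2_off=$(( aligned - 524288 ))
    l3_off=$(( aligned - 262144 ))

    # Zero one 256 KiB label slot (repeat with l3_off if the survivor is in
    # the other one).  Writing to the raw disk while its partitions are in
    # use may need 'sysctl kern.geom.debugflags=16' or a rescue environment.
    # The very last slot can butt up against the backup GPT table; 'gpart
    # recover' can restore that if it gets clipped.
    dd if=/dev/zero of=/dev/da0 bs=262144 count=1 oseek=$(( l2_off / 262144 ))

Afterwards, re-run zdb -l to confirm the stale label is gone, and scrub the
pool.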