Date:      Tue, 16 May 2017 08:31:21 +0200 (CEST)
From:      Trond Endrestøl <Trond.Endrestol@fagskolen.gjovik.no>
To:        Nikos Vassiliadis <nvass@gmx.com>
Cc:        freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject:   Re: zpool imported twice with different names (was Re: Fwd: ZFS)
Message-ID:  <alpine.BSF.2.21.1705160825130.40966@mail.fig.ol.no>
In-Reply-To: <ca7b47a7-7512-3cbb-d47b-6ef546dffd74@gmx.com>
References:  <7c059678-4af4-f0c9-ff3b-c6266e02fb7a@gmx.com> <adf4ab9f-72f1-ed0f-fee2-82caba3af4a4@gmx.com> <ca7b47a7-7512-3cbb-d47b-6ef546dffd74@gmx.com>

On Mon, 15 May 2017 20:11+0200, Nikos Vassiliadis wrote:

> Fix the e-mail subject
> 
> On 05/15/2017 08:09 PM, Nikos Vassiliadis wrote:
> > Hi everybody,
> > 
> > While trying to rename a zpool from zroot to vega,
> > I ended up in this strange situation:
> > nik@vega:~ % zfs list -t all
> > NAME                 USED  AVAIL  REFER  MOUNTPOINT
> > vega                1.83G  34.7G    96K  /zroot
> > vega/ROOT           1.24G  34.7G    96K  none
> > vega/ROOT/default   1.24G  34.7G  1.24G  /
> > vega/tmp             120K  34.7G   120K  /tmp
> > vega/usr             608M  34.7G    96K  /usr
> > vega/usr/home        136K  34.7G   136K  /usr/home
> > vega/usr/ports        96K  34.7G    96K  /usr/ports
> > vega/usr/src         607M  34.7G   607M  /usr/src
> > vega/var             720K  34.7G    96K  /var
> > vega/var/audit        96K  34.7G    96K  /var/audit
> > vega/var/crash        96K  34.7G    96K  /var/crash
> > vega/var/log         236K  34.7G   236K  /var/log
> > vega/var/mail        100K  34.7G   100K  /var/mail
> > vega/var/tmp          96K  34.7G    96K  /var/tmp
> > zroot               1.83G  34.7G    96K  /zroot
> > zroot/ROOT          1.24G  34.7G    96K  none
> > zroot/ROOT/default  1.24G  34.7G  1.24G  /
> > zroot/tmp            120K  34.7G   120K  /tmp
> > zroot/usr            608M  34.7G    96K  /usr
> > zroot/usr/home       136K  34.7G   136K  /usr/home
> > zroot/usr/ports       96K  34.7G    96K  /usr/ports
> > zroot/usr/src        607M  34.7G   607M  /usr/src
> > zroot/var            724K  34.7G    96K  /var
> > zroot/var/audit       96K  34.7G    96K  /var/audit
> > zroot/var/crash       96K  34.7G    96K  /var/crash
> > zroot/var/log        240K  34.7G   240K  /var/log
> > zroot/var/mail       100K  34.7G   100K  /var/mail
> > zroot/var/tmp         96K  34.7G    96K  /var/tmp
> > nik@vega:~ % zpool status
> >    pool: vega
> >   state: ONLINE
> >    scan: scrub repaired 0 in 0h0m with 0 errors on Mon May 15 01:28:48 2017
> > config:
> > 
> >      NAME        STATE     READ WRITE CKSUM
> >      vega        ONLINE       0     0     0
> >        vtbd0p3   ONLINE       0     0     0
> > 
> > errors: No known data errors
> > 
> >    pool: zroot
> >   state: ONLINE
> >    scan: scrub repaired 0 in 0h0m with 0 errors on Mon May 15 01:28:48 2017
> > config:
> > 
> >      NAME        STATE     READ WRITE CKSUM
> >      zroot       ONLINE       0     0     0
> >        vtbd0p3   ONLINE       0     0     0
> > 
> > errors: No known data errors
> > nik@vega:~ %
> > -------------------------------------------
> > 
> > It seems like there are two pools, sharing the same vdev...
> > 
> > After running a few commands in this state, like doing a scrub,
> > the pool was (most probably) destroyed. It couldn't boot anymore
> > and I didn't research further. Is this a known bug?
> > 

I guess you had a /boot/zfs/zpool.cache file still referring to the pool 
under its original name, zroot. At boot the kernel imported that stale 
zroot entry and then also found the renamed vega pool, without realising 
that the two are one and the same pool on the same vdev.
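
You can usually see the mismatch before touching anything by comparing 
what the cache file claims with what the on-disk vdev label says 
(roughly; this assumes the default cache file location and the vtbd0p3 
device from your zpool status output):

# pool configurations as recorded in the cache file the kernel booted with
zdb -C -U /boot/zfs/zpool.cache
# pool name as recorded in the on-disk vdev label
zdb -l /dev/vtbd0p3 | grep -w name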

> > Steps to reproduce:
> >    install FreeBSD-11.0 in a pool named zroot
> >    reboot into a live-CD

Redo the above steps.

> >    zpool import -f zroot vega

Do these four commands instead of a regular import:

mkdir /tmp/vega
# import the pool under its new name without mounting anything (-N),
# pointing it at a fresh, temporary cache file
zpool import -N -f -o cachefile=/tmp/zpool.cache zroot vega
# mount the root dataset by hand and copy the new cache file into place,
# so the next boot no longer finds the stale zroot entry
mount -t zfs vega/ROOT/default /tmp/vega
cp -p /tmp/zpool.cache /tmp/vega/boot/zfs/zpool.cache

> >    reboot again

Reboot again.
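
After the reboot it's worth a quick sanity check that the stale entry is 
really gone, something like:

zpool status                      # only vega should be listed, on vtbd0p3
zdb -C -U /boot/zfs/zpool.cache   # should now show a single config, named vega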

> > 
> > Thanks,
> > Nikos
> > 
> > PS:
> > Sorry for the cross-posting, I am doing this to share to more people
> > because it is a rather easy way to destroy a ZFS pool.

-- 
+-------------------------------+------------------------------------+
| Vennlig hilsen,               | Best regards,                      |
| Trond Endrestøl,              | Trond Endrestøl,                   |
| IT-ansvarlig,                 | System administrator,              |
| Fagskolen Innlandet,          | Gjøvik Technical College, Norway,  |
| tlf. mob.   952 62 567,       | Cellular...: +47 952 62 567,       |
| sentralbord 61 14 54 00.      | Switchboard: +47 61 14 54 00.      |
+-------------------------------+------------------------------------+

Date:      Tue, 16 May 2017 13:11:32 +0300
From:      Andriy Gapon <avg@FreeBSD.org>
To:        Kristof Provost <kristof@sigsegv.be>, Freebsd current <freebsd-current@FreeBSD.org>, freebsd-fs@FreeBSD.org
Subject:   Re: zfs recv panic
Message-ID:  <98df7d70-4ecb-34f2-7db2-d11a4b0c854a@FreeBSD.org>
In-Reply-To: <18A74EE1-3358-4276-88EA-C13E28D8563A@sigsegv.be>
References:  <18A74EE1-3358-4276-88EA-C13E28D8563A@sigsegv.be>

On 10/05/2017 12:37, Kristof Provost wrote:
> Hi,
> 
> I have a reproducible panic on CURRENT (r318136) doing
> (jupiter) # zfs send -R -v zroot/var@before-kernel-2017-04-26 | nc dual 1234
> (dual) # nc -l 1234 | zfs recv -v -F tank/jupiter/var
> 
> For clarity, the receiving machine is CURRENT r318136, the sending machine is
> running a somewhat older CURRENT version.
> 
> The receiving machine panics a few seconds in:
> 
> receiving full stream of zroot/var@before-kernel-2017-04-03 into
> tank/jupiter/var@before-kernel-2017-04-03
> panic: solaris assert: dbuf_is_metadata(db) == arc_is_metadata(buf) (0x0 ==
> 0x1), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c,
> line: 2007

Kristof,

could you please try to revert the commits related to compressed send and see
if that helps?  I assume that the sending machine does not have (or does not
use) the feature while the target machine is capable of it.

The commits are r317648 and r317414.  Not that I really suspect those changes,
but just to eliminate the possibility.
Thank you.
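
If it helps, backing the two revisions out of an svn checkout of head should
look roughly like this (the source URL and the GENERIC kernel config are just
the usual defaults, adjust to your setup):

cd /usr/src
# reverse-merge the newer revision first, then the older one
svn merge -c -317648 https://svn.freebsd.org/base/head .
svn merge -c -317414 https://svn.freebsd.org/base/head .
make buildkernel KERNCONF=GENERIC && make installkernel KERNCONF=GENERIC
shutdown -r now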

> cpuid = 0
> time = 1494408122
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0120cad930
> vpanic() at vpanic+0x19c/frame 0xfffffe0120cad9b0
> panic() at panic+0x43/frame 0xfffffe0120cada10
> assfail3() at assfail3+0x2c/frame 0xfffffe0120cada30
> dbuf_assign_arcbuf() at dbuf_assign_arcbuf+0xf2/frame 0xfffffe0120cada80
> dmu_assign_arcbuf() at dmu_assign_arcbuf+0x170/frame 0xfffffe0120cadad0
> receive_writer_thread() at receive_writer_thread+0x6ac/frame 0xfffffe0120cadb70
> fork_exit() at fork_exit+0x84/frame 0xfffffe0120cadbb0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0120cadbb0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 7 tid 100672 ]
> Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
> db>
> 
> 
> kgdb backtrace:
> #0  doadump (textdump=0) at pcpu.h:232
> #1  0xffffffff803a208b in db_dump (dummy=<value optimized out>, dummy2=<value
> optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>) at
> /usr/src/sys/ddb/db_command.c:546
> #2  0xffffffff803a1e7f in db_command (cmd_table=<value optimized out>) at
> /usr/src/sys/ddb/db_command.c:453
> #3  0xffffffff803a1bb4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:506
> #4  0xffffffff803a4c7f in db_trap (type=<value optimized out>, code=<value
> optimized out>) at /usr/src/sys/ddb/db_main.c:248
> #5  0xffffffff80a93cb3 in kdb_trap (type=3, code=-61456, tf=<value optimized
> out>) at /usr/src/sys/kern/subr_kdb.c:654
> #6  0xffffffff80ed3de6 in trap (frame=0xfffffe0120cad860) at
> /usr/src/sys/amd64/amd64/trap.c:537
> #7  0xffffffff80eb62f1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
> #8  0xffffffff80a933eb in kdb_enter (why=0xffffffff8143d8f5 "panic", msg=<value
> optimized out>) at cpufunc.h:63
> #9  0xffffffff80a51cf9 in vpanic (fmt=<value optimized out>,
> ap=0xfffffe0120cad9f0) at /usr/src/sys/kern/kern_shutdown.c:772
> #10 0xffffffff80a51d63 in panic (fmt=<value optimized out>) at
> /usr/src/sys/kern/kern_shutdown.c:710
> #11 0xffffffff8262b26c in assfail3 (a=<value optimized out>, lv=<value optimized
> out>, op=<value optimized out>, rv=<value optimized out>, f=<value optimized
> out>, l=<value optimized out>)
>     at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:91
> #12 0xffffffff822ad892 in dbuf_assign_arcbuf (db=0xfffff8008f23e560,
> buf=0xfffff8008f09fcc0, tx=0xfffff8008a8d5200) at
> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:2007
> #13 0xffffffff822b87f0 in dmu_assign_arcbuf (handle=<value optimized out>,
> offset=0, buf=0xfffff8008f09fcc0, tx=0xfffff8008a8d5200) at
> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1542
> #14 0xffffffff822bf7fc in receive_writer_thread (arg=0xfffffe0120a1d168) at
> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c:2284
> #15 0xffffffff80a13704 in fork_exit (callout=0xffffffff822bf150
> <receive_writer_thread>, arg=0xfffffe0120a1d168, frame=0xfffffe0120cadbc0) at
> /usr/src/sys/kern/kern_fork.c:1038
> #16 0xffffffff80eb682e in fork_trampoline () at
> /usr/src/sys/amd64/amd64/exception.S:611
> #17 0x0000000000000000 in ?? ()
> 
> Let me know if there’s any other information I can provide, or things I can test.
> Fortunately the target machine is not a production machine, so I can panic it as
> often as required.

-- 
Andriy Gapon


