From owner-freebsd-fs@freebsd.org Tue May 16 06:31:50 2017
Date: Tue, 16 May 2017 08:31:21 +0200 (CEST)
From: Trond Endrestøl <trond@fagskolen.gjovik.no>
To: Nikos Vassiliadis
cc: freebsd-fs@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: zpool imported twice with different names (was Re: Fwd: ZFS)
References: <7c059678-4af4-f0c9-ff3b-c6266e02fb7a@gmx.com>
Organization: Fagskolen Innlandet

On Mon, 15 May 2017 20:11+0200, Nikos Vassiliadis wrote:

> Fix the e-mail subject
>
> On 05/15/2017 08:09 PM, Nikos Vassiliadis wrote:
> > Hi everybody,
> >
> > While trying to rename a zpool from zroot to vega,
> > I ended up in this strange situation:
> >
> > nik@vega:~ % zfs list -t all
> > NAME                 USED  AVAIL  REFER  MOUNTPOINT
> > vega                1.83G  34.7G    96K  /zroot
> > vega/ROOT           1.24G  34.7G    96K  none
> > vega/ROOT/default   1.24G  34.7G  1.24G  /
> > vega/tmp             120K  34.7G   120K  /tmp
> > vega/usr             608M  34.7G    96K  /usr
> > vega/usr/home        136K  34.7G   136K  /usr/home
> > vega/usr/ports        96K  34.7G    96K  /usr/ports
> > vega/usr/src         607M  34.7G   607M  /usr/src
> > vega/var             720K  34.7G    96K  /var
> > vega/var/audit        96K  34.7G    96K  /var/audit
> > vega/var/crash        96K  34.7G    96K  /var/crash
> > vega/var/log         236K  34.7G   236K  /var/log
> > vega/var/mail        100K  34.7G   100K  /var/mail
> > vega/var/tmp          96K  34.7G    96K  /var/tmp
> > zroot               1.83G  34.7G    96K  /zroot
> > zroot/ROOT          1.24G  34.7G    96K  none
> > zroot/ROOT/default  1.24G  34.7G  1.24G  /
> > zroot/tmp            120K  34.7G   120K  /tmp
> > zroot/usr            608M  34.7G    96K  /usr
> > zroot/usr/home       136K  34.7G   136K  /usr/home
> > zroot/usr/ports       96K  34.7G    96K  /usr/ports
> > zroot/usr/src        607M  34.7G   607M  /usr/src
> > zroot/var            724K  34.7G    96K  /var
> > zroot/var/audit       96K  34.7G    96K  /var/audit
> > zroot/var/crash       96K  34.7G    96K  /var/crash
> > zroot/var/log        240K  34.7G   240K  /var/log
> > zroot/var/mail       100K  34.7G   100K  /var/mail
> > zroot/var/tmp         96K  34.7G    96K  /var/tmp
> > nik@vega:~ % zpool status
> >   pool: vega
> >  state: ONLINE
> >   scan: scrub repaired 0 in 0h0m with 0 errors on Mon May 15 01:28:48 2017
> > config:
> >
> >         NAME        STATE     READ WRITE CKSUM
> >         vega        ONLINE       0     0     0
> >           vtbd0p3   ONLINE       0     0     0
> >
> > errors: No known data errors
> >
> >   pool: zroot
> >  state: ONLINE
> >   scan: scrub repaired 0 in 0h0m with 0 errors on Mon May 15 01:28:48 2017
> > config:
> >
> >         NAME        STATE     READ WRITE CKSUM
> >         zroot       ONLINE       0     0     0
> >           vtbd0p3   ONLINE       0     0     0
> >
> > errors: No known data errors
> > nik@vega:~ %
> > -------------------------------------------
> >
> > It seems like there are two pools, sharing the same vdev...
> >
> > After running a few commands in this state, like doing a scrub,
> > the pool was (most probably) destroyed. It couldn't boot anymore
> > and I didn't research further. Is this a known bug?

I guess you had a /boot/zfs/zpool.cache file referring to the original zroot
pool. Next, the kernel found the vega pool and didn't realise these two pools
are the very same.

> > Steps to reproduce:
> > install FreeBSD-11.0 in a pool named zroot
> > reboot into a live-CD

Redo the above steps.

> > zpool import -f zroot vega

Do these four commands instead of a regular import:

mkdir /tmp/vega
zpool import -N -f -o cachefile=/tmp/zpool.cache vega
mount -t zfs vega/ROOT/default /tmp/vega
cp -p /tmp/zpool.cache /tmp/vega/boot/zfs/zpool.cache

> > reboot again

Reboot again.
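Spelled out end to end, the recovery above might look roughly like the
following. This is an untested sketch: it folds the rename of zroot to vega
and the cachefile setting into a single zpool import, and it reuses the pool,
dataset and device names from this thread.

    # From the live CD: import the old pool under its new name, writing a
    # fresh cachefile that only knows about "vega".
    mkdir /tmp/vega
    zpool import -f -N -o cachefile=/tmp/zpool.cache zroot vega

    # Mount the root dataset and install the new cachefile into the pool
    # itself, so the next boot does not also pick up the stale "zroot" entry.
    mount -t zfs vega/ROOT/default /tmp/vega
    cp -p /tmp/zpool.cache /tmp/vega/boot/zfs/zpool.cache

    # Optional sanity check: dump the new cachefile and confirm it names
    # only "vega" before rebooting.
    zdb -C -U /tmp/zpool.cache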
> >
> > Thanks,
> > Nikos
> >
> > PS:
> > Sorry for the cross-posting; I am sharing this with more people because
> > it is a rather easy way to destroy a ZFS pool.

-- 
+-------------------------------+------------------------------------+
| Vennlig hilsen,               | Best regards,                      |
| Trond Endrestøl,              | Trond Endrestøl,                   |
| IT-ansvarlig,                 | System administrator,              |
| Fagskolen Innlandet,          | Gjøvik Technical College, Norway,  |
| tlf. mob. 952 62 567,         | Cellular...: +47 952 62 567,       |
| sentralbord 61 14 54 00.      | Switchboard: +47 61 14 54 00.      |
+-------------------------------+------------------------------------+
From owner-freebsd-fs@freebsd.org Tue May 16 10:13:02 2017
Date: Tue, 16 May 2017 13:11:32 +0300
From: Andriy Gapon <avg@FreeBSD.org>
To: Kristof Provost, Freebsd current, freebsd-fs@FreeBSD.org
Subject: Re: zfs recv panic
Message-ID: <98df7d70-4ecb-34f2-7db2-d11a4b0c854a@FreeBSD.org>
In-Reply-To: <18A74EE1-3358-4276-88EA-C13E28D8563A@sigsegv.be>
References: <18A74EE1-3358-4276-88EA-C13E28D8563A@sigsegv.be>

On 10/05/2017 12:37, Kristof Provost wrote:
> Hi,
>
> I have a reproducible panic on CURRENT (r318136) doing
>
> (jupiter) # zfs send -R -v zroot/var@before-kernel-2017-04-26 | nc dual 1234
> (dual)    # nc -l 1234 | zfs recv -v -F tank/jupiter/var
>
> For clarity, the receiving machine is CURRENT r318136; the sending machine
> is running a somewhat older CURRENT version.
>
> The receiving machine panics a few seconds in:
>
> receiving full stream of zroot/var@before-kernel-2017-04-03 into
> tank/jupiter/var@before-kernel-2017-04-03
> panic: solaris assert: dbuf_is_metadata(db) == arc_is_metadata(buf) (0x0 ==
> 0x1), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c,
> line: 2007

Kristof, could you please try to revert the commits related to compressed
send and see if that helps? I assume that the sending machine does not have
(does not use) the feature, while the target machine is capable of it.
The commits are r317648 and r317414.
Not that I really suspect that change, but just to eliminate the possibility.
Thank you.
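For reference, backing those two revisions out of a Subversion checkout of
head could be done with a reverse merge along these lines (a sketch only; it
assumes /usr/src is an svn working copy and a GENERIC kernel config, and the
merge may need manual conflict resolution):

    cd /usr/src
    # Reverse-merge the two compressed-send commits, newest first.
    svn merge -c -317648,-317414 .
    # Rebuild and install the kernel, then reboot into it.
    make -j8 buildkernel KERNCONF=GENERIC
    make installkernel KERNCONF=GENERIC
    shutdown -r now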
> cpuid = 0
> time = 1494408122
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0120cad930
> vpanic() at vpanic+0x19c/frame 0xfffffe0120cad9b0
> panic() at panic+0x43/frame 0xfffffe0120cada10
> assfail3() at assfail3+0x2c/frame 0xfffffe0120cada30
> dbuf_assign_arcbuf() at dbuf_assign_arcbuf+0xf2/frame 0xfffffe0120cada80
> dmu_assign_arcbuf() at dmu_assign_arcbuf+0x170/frame 0xfffffe0120cadad0
> receive_writer_thread() at receive_writer_thread+0x6ac/frame 0xfffffe0120cadb70
> fork_exit() at fork_exit+0x84/frame 0xfffffe0120cadbb0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0120cadbb0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 7 tid 100672 ]
> Stopped at kdb_enter+0x3b: movq $0,kdb_why
> db>
>
> kgdb backtrace:
> #0  doadump (textdump=0) at pcpu.h:232
> #1  0xffffffff803a208b in db_dump (dummy=<value optimized out>,
>     dummy2=<value optimized out>, dummy3=<value optimized out>,
>     dummy4=<value optimized out>) at /usr/src/sys/ddb/db_command.c:546
> #2  0xffffffff803a1e7f in db_command (cmd_table=<value optimized out>)
>     at /usr/src/sys/ddb/db_command.c:453
> #3  0xffffffff803a1bb4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:506
> #4  0xffffffff803a4c7f in db_trap (type=<value optimized out>,
>     code=<value optimized out>) at /usr/src/sys/ddb/db_main.c:248
> #5  0xffffffff80a93cb3 in kdb_trap (type=3, code=-61456,
>     tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
> #6  0xffffffff80ed3de6 in trap (frame=0xfffffe0120cad860)
>     at /usr/src/sys/amd64/amd64/trap.c:537
> #7  0xffffffff80eb62f1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
> #8  0xffffffff80a933eb in kdb_enter (why=0xffffffff8143d8f5 "panic",
>     msg=<value optimized out>) at cpufunc.h:63
> #9  0xffffffff80a51cf9 in vpanic (fmt=<value optimized out>,
>     ap=0xfffffe0120cad9f0) at /usr/src/sys/kern/kern_shutdown.c:772
> #10 0xffffffff80a51d63 in panic (fmt=<value optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:710
> #11 0xffffffff8262b26c in assfail3 (a=<value optimized out>,
>     lv=<value optimized out>, op=<value optimized out>,
>     rv=<value optimized out>, f=<value optimized out>, l=<value optimized out>)
>     at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:91
> #12 0xffffffff822ad892 in dbuf_assign_arcbuf (db=0xfffff8008f23e560,
>     buf=0xfffff8008f09fcc0, tx=0xfffff8008a8d5200) at
>     /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:2007
> #13 0xffffffff822b87f0 in dmu_assign_arcbuf (handle=<value optimized out>,
>     offset=0, buf=0xfffff8008f09fcc0, tx=0xfffff8008a8d5200) at
>     /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1542
> #14 0xffffffff822bf7fc in receive_writer_thread (arg=0xfffffe0120a1d168) at
>     /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c:2284
> #15 0xffffffff80a13704 in fork_exit (callout=0xffffffff822bf150
>     <receive_writer_thread>, arg=0xfffffe0120a1d168, frame=0xfffffe0120cadbc0)
>     at /usr/src/sys/kern/kern_fork.c:1038
> #16 0xffffffff80eb682e in fork_trampoline () at
>     /usr/src/sys/amd64/amd64/exception.S:611
> #17 0x0000000000000000 in ?? ()
>
> Let me know if there’s any other information I can provide, or things I can
> test. Fortunately the target machine is not a production machine, so I can
> panic it as often as required.

-- 
Andriy Gapon