From owner-freebsd-fs@FreeBSD.ORG Fri Jul 31 09:05:26 2009
From: Thomas Backman
To: Thomas Backman
Date: Fri, 31 Jul 2009 11:05:01 +0200
Cc: freebsd-fs@freebsd.org, FreeBSD current, Pawel Jakub Dawidek, Andriy Gapon
Subject: Re: zfs: Fatal trap 12: page fault while in kernel mode

On Jul 30, 2009, at 20:29, Thomas Backman wrote:

> On Jul 30, 2009, at 18:41, Thomas Backman wrote:
>
>> On Jul 30, 2009, at 17:40, Andriy Gapon wrote:
>>> on 30/07/2009 18:25 Thomas Backman said the following:
>>>> PS. I'll test Pawel's patch sometime after dinner. ;)
>>>
>>> I believe that you should get a perfect result with it.
>>>
>>> -- Andriy Gapon
>>
>> If I dare say it, you were right! I've been testing for about half
>> an hour or so (probably a bit more) now.
>> Still using DEBUG_VFS_LOCKS, and I've tried the test case several
>> times, ran an initial backup (i.e. destroy target pool and
>> send|recv the entire pool) and a few incrementals. Rebooted, tried
>> it again. No panic, no problems! :)
>> Let's hope it stays this way.
>>
>> So, in short: With that patch (copied here just in case:
>> http://exscape.org/temp/zfs_vnops.working.patch ) and the libzfs
>> patch linked previously, it appears zfs send/recv works plain fine.
>> I have yet to try it with clone/promote and stuff, but since that
>> gave the same panic that this solved, I'm hoping there will be no
>> problems with that anymore.
>
> Arrrgh!
> I guess I spoke too soon after all... new panic yet again.
> :(
> *sigh* It feels as if this will never become stable right now.
> (Maybe that's because I've spent all day and most of yesterday too
> on this ;)
>
> Steps and panic info:
>
> (Prior to this, I tried a simple zfs promote on one of my clones,
> and then reverted it by promoting the other FS again, with no
> problems on running the backup script.)
>
> [root@chaos ~]# zfs destroy -r tank/testfs
> [root@chaos ~]# bash backup.sh backup
> (all output is from zfs, on zfs send -R -I old tank@new | zfs recv -Fvd slave)
>
> attempting destroy slave/testfs@backup-20090730-2009
> success
> attempting destroy slave/testfs@backup-20090730-1823
> success
> attempting destroy slave/testfs@backup-20090730-1801
> success
> attempting destroy slave/testfs@backup-20090730-2011
> success
> attempting destroy slave/testfs@backup-20090730-1827
> success
> attempting destroy slave/testfs
> success
> receiving incremental stream of tank@backup-20090730-2012 into slave@backup-20090730-2012
> received 312B stream in 1 seconds (312B/sec)
> receiving incremental stream of tank/tmp@backup-20090730-2012 into slave/tmp@backup-20090730-2012
> received 312B stream in 1 seconds (312B/sec)
> receiving incremental stream of tank/var@backup-20090730-2012 into slave/var@backup-20090730-2012
> received 32.6KB stream in 1 seconds (32.6KB/sec)
> receiving incremental stream of tank/var/log@backup-20090730-2012 into slave/var/log@backup-20090730-2012
> received 298KB stream in 1 seconds (298KB/sec)
> receiving incremental stream of tank/var/crash@backup-20090730-2012 into slave/var/crash@backup-20090730-2012
> received 312B stream in 1 seconds (312B/sec)
> receiving incremental stream of tank/root@backup-20090730-2012 into slave/root@backup-20090730-2012
> [... panic here ...]
>
> Unread portion of the kernel message buffer:
> panic: solaris assert: ((zp)->z_vnode)->v_usecount > 0, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c, line: 920
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> panic() at panic+0x182
> zfsvfs_teardown() at zfsvfs_teardown+0x24d
> zfs_suspend_fs() at zfs_suspend_fs+0x2b
> zfs_ioc_recv() at zfs_ioc_recv+0x28b
> zfsdev_ioctl() at zfsdev_ioctl+0x8a
> devfs_ioctl_f() at devfs_ioctl_f+0x77
> kern_ioctl() at kern_ioctl+0xf6
> ioctl() at ioctl+0xfd
> syscall() at syscall+0x28f
> Xfast_syscall() at Xfast_syscall+0xe1
> --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe5f7c, rsp = 0x7fffffff8ef8, rbp = 0x7fffffff9c30 ---
> KDB: enter: panic
> panic: from debugger
>
> #9  0xffffffff8057eda7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
> #10 0xffffffff8036c8ad in kdb_enter (why=0xffffffff80609c44 "panic", msg=0xa) at cpufunc.h:63
> #11 0xffffffff8033abcb in panic (fmt=Variable "fmt" is not available.) at /usr/src/sys/kern/kern_shutdown.c:558
> #12 0xffffffff80b0ec5d in zfsvfs_teardown () from /boot/kernel/zfs.ko
> #13 0x0000000000100000 in ?? ()
> #14 0xffffff001bff0250 in ?? ()
> #15 0xffffff001bff0000 in ?? ()
> #16 0xffffff0008004000 in ?? ()
> #17 0xffffff803e9747a0 in ?? ()
> #18 0xffffff803e9747d0 in ?? ()
> #19 0xffffff803e974770 in ?? ()
> #20 0xffffff803e974740 in ?? ()
> #21 0xffffffff80b0ecab in zfs_suspend_fs () from /boot/kernel/zfs.ko
> Previous frame inner to this frame (corrupt stack?)
>
> Unfortunately, I'm not sure I can reproduce this reliably, since it
> worked a bunch of times both before and after my previous mail.
>
> Oh, and I'm still using -DDEBUG=1 and DEBUG_VFS_LOCKS... If this
> isn't a new panic because of the changes, perhaps it was triggered
> now and never before because of the -DDEBUG?
>
> Regards,
> Thomas

I'm able to reliably reproduce this panic, by having zfs recv destroy
a filesystem on the receiving end.

1) Use -DDEBUG=1, I guess
2) Create a FS on the source pool you don't care about:
   zfs create -o mountpoint=/testfs source/testfs
3) Clone a pool to another:
   zfs snapshot -r source@snap && zfs send -R source@snap | zfs recv -Fvd target
4) zfs destroy -r source/testfs
5) zfs snapshot -r source@snap2 && zfs send -R -I snap source@snap2 | zfs recv -Fvd target
6) ^ Panic while receiving the FS the destroyed one is mounted under.

In my case, this was tank/root three times out of three; I then tried
creating testfs under /tmp (tank/tmp/testfs), *mounting* it under
/usr/testfs, and it panics on receiving tank/usr:

attempting destroy slave/tmp/testfs@backup-20090731-1100
success
attempting destroy slave/tmp/testfs@backup-20090731-1036
success
attempting destroy slave/tmp/testfs
success
...
receiving incremental stream of tank/tmp@backup-20090731-1101 into slave/tmp@backup-20090731-1101
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of tank/root@backup-20090731-1101 into slave/root@backup-20090731-1101
received 58.3KB stream in 1 seconds (58.3KB/sec)
receiving incremental stream of tank/usr@backup-20090731-1101 into slave/usr@backup-20090731-1101
... panic here, no more output

Same backtrace/assert as above.

Regards,
Thomas
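[Editor's note: the numbered reproduction steps in the message can be sketched as the script below. This is an untested reconstruction, not the backup.sh from the report; the pool names source/target, snapshot names snap/snap2, and the /testfs mountpoint are the placeholders from the steps. It defaults to a dry run that only prints each command, since actually receiving the incremental stream is what panics the machine.]

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report (hypothetical,
# not the author's backup.sh). DRY_RUN=1 (the default) only prints
# the commands; set DRY_RUN= to really execute them -- the final
# recv is expected to panic the box on an affected kernel.
DRY_RUN="${DRY_RUN-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it via sh -c
    # so that pipelines inside the string work.
    if [ -n "$DRY_RUN" ]; then
        echo "+ $*"
    else
        sh -c "$*"
    fi
}

# 2) Create a throwaway filesystem on the source pool
run "zfs create -o mountpoint=/testfs source/testfs"

# 3) Replicate the source pool to the target pool
run "zfs snapshot -r source@snap"
run "zfs send -R source@snap | zfs recv -Fvd target"

# 4) Destroy the throwaway filesystem on the source side only
run "zfs destroy -r source/testfs"

# 5) Send an incremental stream; receiving it makes zfs recv destroy
#    target/testfs and, per the report, panic in zfsvfs_teardown()
#    while receiving the filesystem the destroyed one was mounted under
run "zfs snapshot -r source@snap2"
run "zfs send -R -I snap source@snap2 | zfs recv -Fvd target"
```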