Date: Sun, 21 Aug 2011 12:04:26 +0200
From: Hans Ottevanger
To: Hugo Silva
Cc: freebsd-current@freebsd.org
Subject: Re: Fwd: Re: Can *you* UFS snapshot a filesystem with 9.0-BETA1?
Message-ID: <20110821100426.GA28260@testsoekris.hotsoft.nl>
In-Reply-To: <4E4F71B5.3010606@barafranca.com>
References: <4E4F71B5.3010606@barafranca.com>

On Sat, Aug 20, 2011 at 09:35:01AM +0100, Hugo Silva wrote:
>
> On Thu, 18 Aug 2011 10:22:31 +0100,
> Hugo Silva wrote:
>
> > Hello,
> >
> > I'm wondering. On a virtual machine (amd64 HVM+PV), it's crashing
> > every time. Not sure if this is SNAFU, as I had never used ufs
> > snapshots on freebsd before.
> >
> > After running mksnap_ffs, ssh stops working (a telnet session doesn't
> > show the sshd banner). The ssh session where the command was run from
> > stops responding, the webserver dies and xm console'ing from the dom0
> > works, but the VM is unresponsive (ie no login prompt on ENTER).
> >
> > Anyone else seeing the same?
>
> I've tried in a FreeBSD guest (9.0-beta1/i386) in VirtualBox and
> I see a LOR (or what looks like a LOR), then the system freezes.
> This is 100% reproducible.
>
> Unfortunately, I'm not able to dump a panic or to break into the
> debugger, so a screenshot:
> http://user.lamaiziere.net/patrick/public/lormksnap.png
>
> You should ask on freebsd-current@
>

Hi,

I can confirm that this happens on "real iron" too.

I use an i386 test installation (P4 2.4 GHz, 2GB RAM, 500GB PATA disk),
running 9.0-BETA1 as distributed (with a kernel effectively being
GENERIC with devices removed that I don't have).

When I try to make a snapshot using

  cd /usr; mksnap_ffs /usr/.snap/testsnap

the system is still responsive for a few seconds, with lots of disk
activity, but then it prints the following output on the console
(using firewire and dcons to ease capturing):

lock order reversal:
 1st 0xc5a289e8 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:425
 2nd 0xdeb3c078 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2658
 3rd 0xc5663af8 ufs (ufs) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:546
KDB: stack backtrace:
db_trace_self_wrapper(c09ec6ba,616e735f,6f687370,3a632e74,a363435,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c07099eb,c09efe14,c5035308,c5039408,c4fda440,...) at kdb_backtrace+0x2a
_witness_debugger(c09efe14,c5663af8,c09df984,c5039408,c0a10ba2,...) at _witness_debugger+0x25
witness_checkorder(c5663af8,9,c0a10ba2,222,0,...) at witness_checkorder+0x839
__lockmgr_args(c5663af8,80100,c5663b18,0,0,...) at __lockmgr_args+0x804
ffs_lock(c4fda568,c0bf1250,c59b9c30,80100,c5663aa0,...)
at ffs_lock+0x8a
VOP_LOCK1_APV(c0a7fb80,c4fda568,c4fda588,c0a8df20,c5663aa0,...) at VOP_LOCK1_APV+0xb5
_vn_lock(c5663aa0,80100,c0a10ba2,222,c5011e80,...) at _vn_lock+0x5e
ffs_snapshot(c54f9798,c52dda60,c0a13fb0,1a2,0,...) at ffs_snapshot+0x14cb
ffs_mount(c54f9798,c59b0300,ff,394,3,...) at ffs_mount+0x1c13
vfs_donmount(c59b9b80,11100,c50c7c80,c50c7c80,c59ae580,...) at vfs_donmount+0x11e7
nmount(c59b9b80,c4fdacec,c4fdad28,c09ee6dd,0,...) at nmount+0x84
syscallenter(c59b9b80,c4fdace4,c4fdace4,0,c0ab5690,...) at syscallenter+0x263
syscall(c4fdad28) at syscall+0x34
Xint0x80_syscall() at Xint0x80_syscall+0x21
--- syscall (378, FreeBSD ELF32, nmount), eip = 0x280db52b, esp = 0xbfbfe59c, ebp = 0xbfbfed18 ---
lock order reversal:
 1st 0xdeb3c078 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2658
 2nd 0xc51a72dc snaplk (snaplk) @ /usr/src/sys/ufs/ffs/ffs_snapshot.c:818
KDB: stack backtrace:
db_trace_self_wrapper(c09ec6ba,662f7366,735f7366,7370616e,2e746f68,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c07099eb,c09efdfb,c5035308,c5039b58,c4fda440,...) at kdb_backtrace+0x2a
_witness_debugger(c09efdfb,c51a72dc,c0a10c04,c5039b58,c0a10ba2,...) at _witness_debugger+0x25
witness_checkorder(c51a72dc,9,c0a10ba2,332,c5a28a08,...) at witness_checkorder+0x839
__lockmgr_args(c51a72dc,80400,c5a28a08,0,0,...) at __lockmgr_args+0x804
ffs_lock(c4fda568,deb2434c,100000,80400,c5a28990,...) at ffs_lock+0x8a
VOP_LOCK1_APV(c0a7fb80,c4fda568,deb243a8,c0a8df20,c5a28990,...) at VOP_LOCK1_APV+0xb5
_vn_lock(c5a28990,80400,c0a10ba2,332,0,...) at _vn_lock+0x5e
ffs_snapshot(c54f9798,c52dda60,c0a13fb0,1a2,0,...) at ffs_snapshot+0x295e
ffs_mount(c54f9798,c59b0300,ff,394,3,...) at ffs_mount+0x1c13
vfs_donmount(c59b9b80,11100,c50c7c80,c50c7c80,c59ae580,...) at vfs_donmount+0x11e7
nmount(c59b9b80,c4fdacec,c4fdad28,c09ee6dd,0,...) at nmount+0x84
syscallenter(c59b9b80,c4fdace4,c4fdace4,0,c0ab5690,...)
at syscallenter+0x263
syscall(c4fdad28) at syscall+0x34
Xint0x80_syscall() at Xint0x80_syscall+0x21
--- syscall (378, FreeBSD ELF32, nmount), eip = 0x280db52b, esp = 0xbfbfe59c, ebp = 0xbfbfed18 ---

After this the system is fully unresponsive and requires a hard reset.
Once rebooted, the snapshot file appears to exist, but is unusable.

When reverting to just softupdates, i.e. disabling journaling on /usr,
everything goes well, except that the same LORs still occur, though
with different addresses.

My amd64 9.0-CURRENT system, just updated to r225055, has the same
issue, but since I do not have WITNESS in the kernel config there, the
console output is missing.

BTW, this issue also makes dump(8) hang the system when the -L option
is used.

Kind regards,

Hans Ottevanger