From owner-freebsd-fs@FreeBSD.ORG  Sun Feb 15 11:08:55 2009
Return-Path: <owner-freebsd-fs@FreeBSD.ORG>
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 2D1791065670;
	Sun, 15 Feb 2009 11:08:55 +0000 (UTC) (envelope-from stb@lassitu.de)
Received: from koef.zs64.net (koef.zs64.net [212.12.50.230])
	by mx1.freebsd.org (Postfix) with ESMTP id CA1D68FC1B;
	Sun, 15 Feb 2009 11:08:54 +0000 (UTC) (envelope-from stb@lassitu.de)
Received: from localhost by koef.zs64.net (8.14.3/8.14.3) with ESMTP id
	n1FB8qhw003595
	(version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO);
	Sun, 15 Feb 2009 12:08:53 +0100 (CET)
	(envelope-from stb@lassitu.de) (authenticated as stb)
Message-Id: <171C5946-63D1-4AC7-89F7-A951BEF3D1C6@lassitu.de>
From: Stefan Bethke <stb@lassitu.de>
To: freebsd-fs@freebsd.org
In-Reply-To: <3A302EE1-F54D-4415-BC13-CA8ABBA320EC@lassitu.de>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v930.3)
Date: Sun, 15 Feb 2009 12:08:52 +0100
References: <76873DDF-D21B-48AF-9AFB-5A2747BE406B@lassitu.de>
	<3A302EE1-F54D-4415-BC13-CA8ABBA320EC@lassitu.de>
X-Mailer: Apple Mail (2.930.3)
Cc: Pawel Jakub Dawidek <pjd@freebsd.org>
Subject: Re: zfs: using, then destroying a snapshot sometimes panics zfs
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-fs>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Help: <mailto:freebsd-fs-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-fs>,
	<mailto:freebsd-fs-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 15 Feb 2009 11:08:55 -0000


On 15.02.2009 at 11:39, Stefan Bethke wrote:

> On 08.02.2009 at 14:37, Stefan Bethke wrote:
>
>> Sorry I can't be more precise at the moment, but while creating a  
>> script that mirrors some zfs filesystems to another machine, I've  
>> now twice gotten weird behaviour and then a panic.
>>
>> The script iterates over a couple of zfs file systems. For each one, it:
>> - creates a snapshot with zfs snapshot tank/foo@mirror
>> - uses rsync to copy the contents of the snapshot with
>>   rsync /tank/foo/.zfs/snapshot/mirror/ dest:...
>> - destroys the snapshot with zfs destroy tank/foo@mirror
>>
>> While testing the script, I twice reached a point where the snapshot
>> was created without an error message, but rsync then exited with an
>> error message similar to "invalid file handle" on
>> /tank/foo/.zfs/snapshot.
>>
>> At that point, I could cd to /tank/foo/.zfs, but ls produced the  
>> same error message.
>>
>> I then tried to unmount the snapshot with zfs umount, and got a  
>> panic (which I also didn't manage to capture).
>>
>> Is this a generally known issue, or should I try to capture more  
>> information when this happens again?
>
>
> # cd /tank/foo/.zfs
> # ls -l
> ls: snapshot: Bad file descriptor
> total 0
> # cd snapshot
> -su: cd: snapshot: Not a directory
>
> I currently have no snapshots:
> # zfs list -t snapshot
> no datasets available
>
> However, on a different file system, I can list and cd into snapshot:
> # cd /tank/bar/.zfs
> # ls -l
> total 0
> dr-xr-xr-x  2 root  wheel  2 Feb  8 00:43 snapshot/
> # cd snapshot
>
> Trying to umount produces a panic:
> # zfs umount /jail/foo
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address	= 0xa8
> fault code		= supervisor write data, page not present
> instruction pointer	= 0x8:0xffffffff802ee565
> stack pointer	        = 0x10:0xfffffffea29c39e0
> frame pointer	        = 0x10:0xfffffffea29c39f0
> code segment		= base 0x0, limit 0xfffff, type 0x1b
> 			= DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags	= interrupt enabled, resume, IOPL = 0
> current process		= 51383 (zfs)
> [thread pid 51383 tid 100298 ]
> Stopped at      _sx_xlock+0x15: lock cmpxchgq   %rsi,0x18(%rdi)
> db> bt
> Tracing pid 51383 tid 100298 td 0xffffff00a598e720
> _sx_xlock() at _sx_xlock+0x15
> zfsctl_umount_snapshots() at zfsctl_umount_snapshots+0xa5
> zfs_umount() at zfs_umount+0xdd
> dounmount() at dounmount+0x2b4
> unmount() at unmount+0x24b
> syscall() at syscall+0x1a5
> Xfast_syscall() at Xfast_syscall+0xab
> --- syscall (22, FreeBSD ELF64, unmount), rip = 0x800f412fc, rsp = 0x7fffffffd1a8, rbp = 0x801202300 ---
> db> call doadump
> Physical memory: 3314 MB
> Dumping 1272 MB: 1257 1241 1225 1209 1193 1177 1161 1145 1129 1113  
> 1097 1081 1065 1049 1033 1017 1001 985 969 953 937 921 905 889 873  
> 857 841 825 809 793 777 761 745 729 713 697 681 665 649 633 617 601  
> 585 569 553 537 521 505 489 473 457 441 425 409 393 377 361 345 329  
> 313 297 281 265 249 233 217 201 185 169 153 137 121 105 89 73 57 41  
> 25 9
> Dump complete
> = 0
>
> I've saved the crash dump, in case any information in it would be
> helpful.
>
> This is -current from a week ago on amd64.
>
> At the current rate, this happens every couple of days, so gathering  
> more information on the live system probably won't be a problem.
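
For reference, the mirroring loop described in the quoted message can be
sketched roughly as follows. This is a minimal sketch, not the actual
script: the snapshot name "mirror" and the .zfs/snapshot/mirror/ path come
from the message, while the function names, the DRY_RUN switch, the rsync
-a flag and the destination argument are assumptions made for illustration.
It defaults to printing the commands rather than running them.

```shell
#!/bin/sh
# Sketch of the snapshot / rsync / destroy cycle from the quoted message.
# run(), mirror_fs() and DRY_RUN are hypothetical names, not from the
# original script.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands (safe default)

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# mirror_fs DATASET MOUNTPOINT DEST
mirror_fs() {
    fs=$1
    mnt=$2
    dest=$3

    run zfs snapshot "$fs@mirror" || return 1

    # Copy from the snapshot's read-only view under .zfs/snapshot.
    run rsync -a "$mnt/.zfs/snapshot/mirror/" "$dest"
    rc=$?

    # Destroy the snapshot even if rsync failed, so the next run can
    # create it again under the same name.
    run zfs destroy "$fs@mirror" || return 1
    return $rc
}
```

Calling mirror_fs tank/foo /tank/foo dest:/backup/foo (the destination is
a made-up example) in dry-run mode prints the three commands in the order
they would run. Setting DRY_RUN=0 executes them instead.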

On a different machine with an identical configuration, I just got this
panic on reboot:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0xa8
fault code		= supervisor write data, page not present
instruction pointer	= 0x8:0xffffffff802ee3b5
stack pointer	        = 0x10:0xfffffffe40016980
frame pointer	        = 0x10:0xfffffffe40016990
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 1 (init)
[thread pid 1 tid 100002 ]
Stopped at      _sx_xlock+0x15: lock cmpxchgq   %rsi,0x18(%rdi)
db> bt
Tracing pid 1 tid 100002 td 0xffffff000141fab0
_sx_xlock() at _sx_xlock+0x15
zfsctl_umount_snapshots() at zfsctl_umount_snapshots+0xa5
zfs_umount() at zfs_umount+0xdd
dounmount() at dounmount+0x2b4
vfs_unmountall() at vfs_unmountall+0x42
boot() at boot+0x655
reboot() at reboot+0x42
syscall() at syscall+0x1a5
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (55, FreeBSD ELF64, reboot), rip = 0x40897c, rsp = 0x7fffffffe7b8, rbp = 0x402420 ---
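
Both traps report the same fault virtual address (0xa8) at the same
instruction, lock cmpxchgq %rsi,0x18(%rdi), which writes to the address
%rdi + 0x18. The small calculation below works backwards from those two
values to the base register. The variable names are only illustrative.

```shell
#!/bin/sh
# Values taken from the trap output above.
fault_addr=0xa8   # "fault virtual address"
disp=0x18         # displacement in "cmpxchgq %rsi,0x18(%rdi)"

# The faulting operand is at %rdi + disp, so %rdi = fault_addr - disp.
rdi=$(printf '0x%x' $((${fault_addr} - ${disp})))
echo "rdi = $rdi"
```

This gives %rdi = 0x90. Assuming %rdi still holds the lock pointer passed
to _sx_xlock() at that point, one possible reading is that
zfsctl_umount_snapshots() passed the address of a lock at offset 0x90
inside a structure whose pointer was NULL. That is only a guess, and the
crash dump would be needed to confirm it.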


-- 
Stefan Bethke <stb@lassitu.de>   Fon +49 151 14070811