Date: Sun, 15 Feb 2009 11:39:57 +0100
From: Stefan Bethke <stb@lassitu.de>
To: freebsd-fs@freebsd.org
Cc: Pawel Jakub Dawidek <pjd@freebsd.org>
Subject: Re: zfs: using, then destroying a snapshot sometimes panics zfs
Message-ID: <3A302EE1-F54D-4415-BC13-CA8ABBA320EC@lassitu.de>
In-Reply-To: <76873DDF-D21B-48AF-9AFB-5A2747BE406B@lassitu.de>
References: <76873DDF-D21B-48AF-9AFB-5A2747BE406B@lassitu.de>

On 08.02.2009 at 14:37, Stefan Bethke wrote:

> Sorry I can't be more precise at the moment, but while creating a
> script that mirrors some zfs filesystems to another machine, I've
> now twice gotten weird behaviour and then a panic.
>
> The script iterates over a couple of zfs file systems:
> - creates a snapshot with zfs snapshot tank/foo@mirror
> - uses rsync to copy the contents of the snapshot with
>   rsync /tank/foo/.zfs/snapshot/mirror/ dest:...
> - destroys the snapshot with zfs destroy tank/foo@mirror
>
> During testing the script, I twice got to a point where, after the
> snapshot was created without an error message, rsync dropped out
> with an error message similar to "invalid file handle" on
> /tank/foo/.zfs/snapshot.
>
> At that point, I could cd to /tank/foo/.zfs, but ls produced the
> same error message.
>
> I then tried to unmount the snapshot with zfs umount, and got a
> panic (which I also didn't manage to capture).
>
> Is this a generally known issue, or should I try to capture more
> information when this happens again?
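
For reference, the loop described in the quoted message could be sketched roughly as below. This is not the actual script: the function name mirror_fs, the RUN dry-run switch, the rsync -a flag, and the destination argument are assumptions made for illustration, and the sketch assumes each dataset is mounted at "/" plus its name (tank/foo at /tank/foo).

```shell
#!/bin/sh
# Hypothetical sketch of the snapshot / rsync / destroy cycle.
# Set RUN=echo to print the commands instead of executing them.
RUN="${RUN:-}"
SNAP="mirror"

mirror_fs() {
    fs="$1"      # dataset name, e.g. tank/foo
    dest="$2"    # rsync destination (placeholder, not from the original)

    # Step 1: create the snapshot.
    $RUN zfs snapshot "${fs}@${SNAP}" || return 1

    # Step 2: copy the snapshot contents via the .zfs control directory.
    # Assumes the dataset is mounted at /${fs}.
    $RUN rsync -a "/${fs}/.zfs/snapshot/${SNAP}/" "${dest}"
    rc=$?

    # Step 3: destroy the snapshot even if rsync failed.
    $RUN zfs destroy "${fs}@${SNAP}" || return 1
    return $rc
}
```

Calling it with RUN=echo, for example mirror_fs tank/foo dest:/backup/foo, only prints the three commands, which makes it easy to check the sequence without touching the pool.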

# cd /tank/foo/.zfs
# ls -l
ls: snapshot: Bad file descriptor
total 0
# cd snapshot
-su: cd: snapshot: Not a directory

I currently have no snapshots:

# zfs list -t snapshot
no datasets available

However, on a different file system, I can list and cd into snapshot:

# /tank/bar/.zfs
# ls -l
total 0
dr-xr-xr-x  2 root  wheel  2 Feb  8 00:43 snapshot/
# cd snapshot

Trying to umount produces a panic:

# zfs umount /jail/foo

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0xa8
fault code              = supervisor write data, page not present
instruction pointer     = 0x8:0xffffffff802ee565
stack pointer           = 0x10:0xfffffffea29c39e0
frame pointer           = 0x10:0xfffffffea29c39f0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 51383 (zfs)
[thread pid 51383 tid 100298 ]
Stopped at      _sx_xlock+0x15: lock cmpxchgq   %rsi,0x18(%rdi)
db> bt
Tracing pid 51383 tid 100298 td 0xffffff00a598e720
_sx_xlock() at _sx_xlock+0x15
zfsctl_umount_snapshots() at zfsctl_umount_snapshots+0xa5
zfs_umount() at zfs_umount+0xdd
dounmount() at dounmount+0x2b4
unmount() at unmount+0x24b
syscall() at syscall+0x1a5
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (22, FreeBSD ELF64, unmount), rip = 0x800f412fc, rsp = 0x7fffffffd1a8, rbp = 0x801202300 ---
db> call doadump
Physical memory: 3314 MB
Dumping 1272 MB: 1257 1241 1225 1209 1193 1177 1161 1145 1129 1113 1097
 1081 1065 1049 1033 1017 1001 985 969 953 937 921 905 889 873 857 841
 825 809 793 777 761 745 729 713 697 681 665 649 633 617 601 585 569
 553 537 521 505 489 473 457 441 425 409 393 377 361 345 329 313 297
 281 265 249 233 217 201 185 169 153 137 121 105 89 73 57 41 25 9
Dump complete
= 0

I've got the crashdump saved, if there's any information in there that
can be helpful. This is -current from a week ago on amd64.

At the current rate, this happens every couple of days, so gathering
more information on the live system probably won't be a problem.


Stefan

-- 
Stefan Bethke <stb@lassitu.de>   Fon +49 151 14070811