Date:      Sat, 9 Aug 2008 11:38:43 +0300
From:      "Dimitar Vasilev" <dimitar.vassilev@gmail.com>
To:        freebsd-fs@freebsd.org
Cc:        pjd@freebsd.org
Subject:   zfs snapshot panic problem
Message-ID:  <59adc1a0808090138t7ab9913bmc08d42fb56801e0d@mail.gmail.com>

Hi all,
I'm having a problem with a 7-STABLE SMP amd64 machine that uses ZFS snapshots
for backups.
After running for 8 days without a problem, it started to complain about bad
file descriptors.
When we then tried to unmount the affected file system, the system crashed.
Kernel config:

ident           FOO
include         GENERIC
nooptions       SCHED_4BSD
options         SCHED_ULE
options         GEOM_JOURNAL
options         ALTQ
options         ALTQ_CBQ
options         ALTQ_RED
options         ALTQ_RIO
options         ALTQ_HFSC
options         ALTQ_PRIQ
options         ALTQ_NOPCC
options         DEVICE_POLLING
options         ZERO_COPY_SOCKETS
options         HZ=2000
#
device          pf
device          pflog                   #logging support interface for PF
device          pfsync                  #synchronization interface for PF
device          carp                    #Common Address Redundancy Protocol

Here is the output of the backtrace:

kgdb -c /var/crash/vmcore.7 /boot/kernel/kernel
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xc0
fault code              = supervisor write data, page not present
instruction pointer     = 0x8:0xffffffff804c2515
stack pointer           = 0x10:0xffffffffd77e4980
frame pointer           = 0x10:0xffffffffd77e4a20
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 19469 (zfs)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 8d12h7m45s
Physical memory: 2034 MB
Dumping 1417 MB: 1402 1386 1370 1354 1338 1322 1306 1290 1274 1258 1242 1226
1210 1194 1178 1162 1146 1130 1114 1098 1082 1066 1050 1034 1018 1002 986
970 954 938 922 906 890 874 858 842 826 810 794 778 762 746 730 714 698 682
666 650 634 618 602 586 570 554 538 522 506 490 474 458 442 426 410 394 378
362 346 330 314 298 282 266 250 234 218 202 186 170 154 138 122 106 90 74 58
42 26 10

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from
/boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/geom_journal.ko...Reading symbols from
/boot/kernel/geom_journal.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/geom_journal.ko
Reading symbols from /boot/kernel/fdescfs.ko...Reading symbols from
/boot/kernel/fdescfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/fdescfs.ko
Reading symbols from /boot/kernel/pflog.ko...Reading symbols from
/boot/kernel/pflog.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/pflog.ko
Reading symbols from /boot/kernel/pf.ko...Reading symbols from
/boot/kernel/pf.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/pf.ko
Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from
/boot/kernel/accf_http.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/accf_http.ko
#0  doadump () at pcpu.h:194
194     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:194
#1  0x0000000000000004 in ?? ()
#2  0xffffffff804ba839 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#3  0xffffffff804bac3d in panic (fmt=0x104 <Address 0x104 out of bounds>) at /usr/src/sys/kern/kern_shutdown.c:572
#4  0xffffffff807871c4 in trap_fatal (frame=0xffffff00015c7360, eva=18446742974224558304)
    at /usr/src/sys/amd64/amd64/trap.c:724
#5  0xffffffff80787595 in trap_pfault (frame=0xffffffffd77e48d0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:641
#6  0xffffffff80787ed8 in trap (frame=0xffffffffd77e48d0) at /usr/src/sys/amd64/amd64/trap.c:410
#7  0xffffffff8076d8ae in calltrap () at /usr/src/sys/amd64/amd64/exception.S:169
#8  0xffffffff804c2515 in _sx_xlock (sx=0xa0, opts=0,
    file=0xffffffff80cb88e0 "/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c",
    line=1069) at atomic.h:142
#9  0xffffffff80c9fb3a in zfsctl_umount_snapshots (vfsp=Variable "vfsp" is not available.
)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ctldir.c:1069
#10 0xffffffff80ca6988 in zfs_umount (vfsp=0xffffff0001560a68, fflag=0, td=0xffffff00015c7360)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:692
#11 0xffffffff80533dbe in dounmount (mp=0xffffff0001560a68, flags=0, td=0xffffff00015c7360)
    at /usr/src/sys/kern/vfs_mount.c:1286
#12 0xffffffff8053458e in unmount (td=0xffffff00015c7360, uap=0xffffffffd77e4be0) at /usr/src/sys/kern/vfs_mount.c:1182
#13 0xffffffff80787817 in syscall (frame=0xffffffffd77e4c70) at /usr/src/sys/amd64/amd64/trap.c:852
#14 0xffffffff8076dabb in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:290
#15 0x0000000800f1514c in ?? ()
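
A rough reading of the numbers above, as a minimal sketch only: the trap
reports a write to 0xc0, and frame #8 shows _sx_xlock being called with
sx=0xa0, so the write landed 0x20 bytes into the lock, and the lock pointer
itself is a small value that looks like a field offset added to a NULL
pointer. The NULL-pointer interpretation and the 4096-byte page threshold are
my assumptions, not something the dump confirms; describe_fault is just an
illustrative helper name.

```python
# Values copied from the trap message and from frame #8 of the backtrace.
fault_address = 0xc0   # "fault virtual address"
sx_pointer = 0xa0      # sx argument passed to _sx_xlock

# Assumption: a 4 KB first page that is left unmapped, so any pointer
# below this value is most likely NULL plus a structure field offset.
PAGE_SIZE = 4096


def describe_fault(fault, lock):
    """Return (offset of the faulting write inside the lock,
    whether the lock pointer looks like NULL plus a field offset)."""
    offset_in_lock = fault - lock
    looks_null_based = 0 <= lock < PAGE_SIZE
    return offset_in_lock, looks_null_based


offset, null_based = describe_fault(fault_address, sx_pointer)
print(f"write hit sx + {offset:#x}; sx looks NULL-based: {null_based}")
```

If that reading is right, the structure holding the lock taken at
zfs_ctldir.c:1069 was not there anymore by the time of the unmount.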

It's the same machine mentioned in:

http://lists.freebsd.org/pipermail/freebsd-fs/2008-February/004377.html
http://lists.freebsd.org/pipermail/freebsd-fs/2008-February/004418.html

Any ideas on how to fix this?
I can compile a kernel with debugging enabled if needed.
Best regards,
Dimitar Vassilev


