Date:      Tue, 27 Nov 2012 11:25:18 -0700
From:      Josh Beard <josh@signalboxes.net>
To:        freebsd-fs@freebsd.org
Subject:   ZFS: Panic when attempting to delete certain data
Message-ID:  <CAHDrHStcfSJ-9ueSV%2BFujEsmAK3zMX2CAGVD6Xz_2gJAThu5Kg@mail.gmail.com>

Hello,

I have a system on which I can consistently reproduce a panic when trying
to delete certain data.  The data was rsynced from another system -
nothing terribly unique.  This has been going on for several months,
starting with 9.0-RELEASE and continuing on 9.1-RC3.

I can't find anything in common among the files that trigger the panics.
One is a simple gzipped archive, while others are plain text.  Strangely,
I can only reproduce it with data that was rsynced from that particular
system (which is a Mac).
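
One way to look for something the affected files share is to dump their
metadata side by side. The following is a minimal sketch, not part of the
original report: the helper names `describe` and `main` are hypothetical,
and it assumes that calling lstat() on the affected files is safe (only
unlinking is reported to panic). It reads metadata only and never removes
anything.

```python
import os
import stat
import sys


def describe(path):
    """Return lstat() metadata for one path without modifying it."""
    st = os.lstat(path)
    return {
        "path": path,
        "mode": stat.filemode(st.st_mode),
        "size": st.st_size,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "nlink": st.st_nlink,
        # BSD file flags; the attribute is missing on some platforms.
        "flags": getattr(st, "st_flags", None),
    }


def main(paths):
    """Print one metadata record per path, skipping unreadable ones."""
    for path in paths:
        try:
            print(describe(path))
        except OSError as exc:
            print(f"{path}: {exc}", file=sys.stderr)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Running it over a few files that panic and a few that don't, then comparing
the records, may show whether mode bits, flags, or ownership differ.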

I seriously doubt it's hardware at this point, as virtually every piece of
hardware in that system has been replaced (including motherboard and
drives).  The zpools were also rebuilt from scratch when the drives were
replaced, and the issue persists.

I can't seem to trigger it with other actions, such as chmod, chown, or
even mv.  Simply attempting to unlink the files seems to do it.

# uname -a (I can reproduce on a GENERIC kernel, too).
FreeBSD bksys1 9.1-RC3 FreeBSD 9.1-RC3 #0 r242591: Sun Nov  4 19:17:25 MST 2012     root@bksys1:/usr/obj/usr/src/sys/BKSYS191  amd64

zpool version is 28; zfs version is 5.

/boot/loader.conf doesn't contain anything relevant, and an empty one
produces the same results.

zpool scrubs are done weekly and have returned no errors (most recent was 3
days ago).

Any insight is much appreciated!

Josh


The message:
Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 05
fault virtual address = 0x160
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80ebd45a
stack pointer        = 0x28:0xffffff8466534850
frame pointer        = 0x28:0xffffff8466534910
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 3245 (rm)
trap number = 12
panic: page fault
cpuid = 3
KDB: stack backtrace:
#0 0xffffffff80585c28 at kdb_backtrace+0x68
#1 0xffffffff805502cb at panic+0x21b
#2 0xffffffff807a9fad at trap_fatal+0x39d
#3 0xffffffff807aa0f0 at trap_pfault+0x120
#4 0xffffffff807aa7e9 at trap+0x3d9
#5 0xffffffff80794f4f at calltrap+0x8
#6 0xffffffff8081cf13 at VOP_REMOVE_APV+0x53
#7 0xffffffff805ed355 at kern_unlinkat+0x265
#8 0xffffffff805ed419 at kern_unlink+0x19
#9 0xffffffff805ed431 at sys_unlink+0x11
#10 0xffffffff807a95bd at amd64_syscall+0x2fd
#11 0xffffffff80795237 at Xfast_syscall+0xf7
Uptime: 14m42s
Dumping 2432 out of 16361 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%

Reading symbols from /boot/kernel/coretemp.ko...Reading symbols from
/boot/kernel/coretemp.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/coretemp.ko
Reading symbols from /boot/kernel/zfs.ko...Reading symbols from
/boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from
/boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/if_lagg.ko...Reading symbols from
/boot/kernel/if_lagg.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/if_lagg.ko
Reading symbols from /boot/kernel/ng_ubt.ko...Reading symbols from
/boot/kernel/ng_ubt.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_ubt.ko
Reading symbols from /boot/kernel/ng_hci.ko...Reading symbols from
/boot/kernel/ng_hci.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_hci.ko
Reading symbols from /boot/kernel/ng_bluetooth.ko...Reading symbols from
/boot/kernel/ng_bluetooth.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_bluetooth.ko
Reading symbols from /boot/kernel/netgraph.ko...Reading symbols from
/boot/kernel/netgraph.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/netgraph.ko
Reading symbols from /boot/kernel/ng_l2cap.ko...Reading symbols from
/boot/kernel/ng_l2cap.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_l2cap.ko
Reading symbols from /boot/kernel/ng_btsocket.ko...Reading symbols from
/boot/kernel/ng_btsocket.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_btsocket.ko
Reading symbols from /boot/kernel/ng_socket.ko...Reading symbols from
/boot/kernel/ng_socket.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ng_socket.ko
Reading symbols from /boot/kernel/blank_saver.ko...Reading symbols from
/boot/kernel/blank_saver.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/blank_saver.ko
#0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
224 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump (textdump=Variable "textdump" is not available.
) at pcpu.h:224
#1  0xffffffff8054ff87 in kern_reboot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:448
#2  0xffffffff8055030f in panic (fmt=Variable "fmt" is not available.
)
    at /usr/src/sys/kern/kern_shutdown.c:636
#3  0xffffffff807a9fad in trap_fatal (frame=0xffffff84665347a0, eva=352)
    at /usr/src/sys/amd64/amd64/trap.c:857
#4  0xffffffff807aa0f0 in trap_pfault (frame=0xffffff84665347a0, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:714
#5  0xffffffff807aa7e9 in trap (frame=0xffffff84665347a0)
    at /usr/src/sys/amd64/amd64/trap.c:456
#6  0xffffffff80794f4f in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:228
#7  0xffffffff80ebd45a in zfs_freebsd_remove (ap=Variable "ap" is not
available.
)
    at
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1855
#8  0xffffffff8081cf13 in VOP_REMOVE_APV (vop=Variable "vop" is not
available.
) at vnode_if.c:1333
#9  0xffffffff805ed355 in kern_unlinkat (td=0xfffffe000c4b1000, fd=-100,
    path=0x7fffffffdb2e <Address 0x7fffffffdb2e out of bounds>,
    pathseg=UIO_USERSPACE, oldinum=0) at vnode_if.h:575
#10 0xffffffff805ed419 in kern_unlink (td=Variable "td" is not available.
)
    at /usr/src/sys/kern/vfs_syscalls.c:1897
#11 0xffffffff805ed431 in sys_unlink (td=Variable "td" is not available.
)
    at /usr/src/sys/kern/vfs_syscalls.c:1867
#12 0xffffffff807a95bd in amd64_syscall (td=0xfffffe000c4b1000, traced=0)
    at subr_syscall.c:135
#13 0xffffffff80795237 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:387
#14 0x00000008009100bc in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
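
As a cross-check between the two dumps above, the `eva=352` argument in the
trap_fatal frame is the decimal form of the `fault virtual address = 0x160`
line in the panic message. The sketch below is not part of the original
report; the function name `fault_address` is hypothetical and the sample
text is copied from the panic message. A fault address this small is often
what a field access through a NULL pointer looks like, but that is an
assumption, not something the dump confirms.

```python
import re

# Two lines copied from the panic message above.
PANIC_TEXT = """\
fault virtual address = 0x160
current process = 3245 (rm)
"""


def fault_address(text):
    """Extract the fault virtual address from trap output as an int."""
    match = re.search(r"fault virtual address\s*=\s*(0x[0-9a-fA-F]+)", text)
    return int(match.group(1), 16) if match else None


addr = fault_address(PANIC_TEXT)
print(addr, hex(addr))  # decimal form matches eva in the kgdb frame
```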

---
atapci0: <Marvell 88SE6145 UDMA133 controller> port
0x2018-0x201f,0x2024-0x2027,0x2010-0x2017,0x2020-0x2023,0x2000-0x200f mem
0xf6100000-0xf61003ff irq 19 at device 0.0 on pci3
ahci0: <Marvell 88SE6145 AHCI SATA controller> at channel -1 on atapci0
ahci0: AHCI v1.00 with 4 3Gbps ports, Port Multiplier supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ata2: <ATA channel> at channel 0 on atapci0
ahci1: <Intel 5 Series/3400 Series AHCI SATA controller> port
0x4068-0x406f,0x4074-0x4077,0x4060-0x4067,0x4070-0x4073,0x4020-0x403f mem
0xf6325000-0xf63257ff irq 19 at device 31.2 on pci0
ahci1: AHCI v1.30 with 6 3Gbps ports, Port Multiplier not supported
ada0 at ahcich1 bus 0 scbus1 target 0 lun 0
ada0: <WDC WD5000AAKS-22YGA0 12.01C02> ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: 476940MB (976773168 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad6
ada1 at ahcich4 bus 0 scbus5 target 0 lun 0
ada1: <WDC WD10EZEX-00RKKA0 80.00A80> ATA-8 SATA 3.x device
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada1: Previously was known as ad14
ada2 at ahcich5 bus 0 scbus6 target 0 lun 0
ada2: <WDC WD10EZEX-00RKKA0 80.00A80> ATA-8 SATA 3.x device
ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada2: Previously was known as ad16
ada3 at ahcich6 bus 0 scbus7 target 0 lun 0
ada3: <WDC WD10EZEX-00RKKA0 80.00A80> ATA-8 SATA 3.x device
ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada3: Previously was known as ad18
ada4 at ahcich7 bus 0 scbus8 target 0 lun 0
ada4: <WDC WD10EZEX-00RKKA0 80.00A80> ATA-8 SATA 3.x device
ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada4: Command Queueing enabled
ada4: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada4: Previously was known as ad20
ada5 at ahcich8 bus 0 scbus9 target 0 lun 0
ada5: <WDC WD10EZEX-00RKKA0 80.00A80> ATA-8 SATA 3.x device
ada5: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada5: Command Queueing enabled
ada5: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada5: Previously was known as ad22
ada6 at ahcich9 bus 0 scbus10 target 0 lun 0
ada6: <WDC WD10EZEX-00RKKA0 80.00A80> ATA-8 SATA 3.x device
ada6: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada6: Command Queueing enabled
ada6: 953869MB (1953525168 512 byte sectors: 16H 63S/T 16383C)
ada6: Previously was known as ad24


