Date:      Wed, 29 Jul 2009 20:04:15 +0200
From:      Thomas Backman <serenity@exscape.org>
To:        Andriy Gapon <avg@FreeBSD.org>
Cc:        freebsd-fs@freebsd.org, FreeBSD current <freebsd-current@freebsd.org>, Pawel Jakub Dawidek <pjd@freebsd.org>
Subject:   Re: zfs: Fatal trap 12: page fault while in kernel mode
Message-ID:  <86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org>
In-Reply-To: <4A708455.5070304@freebsd.org>
References:  <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <F4F82B3E-C119-40EF-9AA4-937052876D1E@exscape.org> <4A7030B6.8010205@icyb.net.ua> <97D5950F-4E4D-4446-AC22-92679135868D@exscape.org> <4A7048A9.4020507@icyb.net.ua> <52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org> <4A705299.8060504@icyb.net.ua> <D3491B77-DA5C-4E10-BE1D-D6EF8CFB112E@exscape.org> <4A7054E1.5060402@icyb.net.ua> <5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org> <4A705E50.8070307@icyb.net.ua> <4A70728C.7020004@freebsd.org> <6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org> <4A708455.5070304@freebsd.org>

On Jul 29, 2009, at 19:18, Andriy Gapon wrote:

>
> Thanks a lot again!
>
> Could you please try the following change?
> In sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, in function
> zfs_inactive() insert the following line:
> 	vrecycle(vp, curthread);
> before the following line:
> 	zfs_znode_free(zp);
>
> This is in "if (zp->z_dbuf == NULL)" branch.
>
> I hope that this should work in concert with the patch that Pawel has
> posted.
>
> P.S.
> Also Pawel has told me that adding 'CFLAGS+=-DDEBUG=1' to
> sys/modules/zfs/Makefile
> should enable additional debugging checks (ASSERTs) in ZFS code.
>
> -- 
> Andriy Gapon
Thanks for your work :)
However, bad news: it didn't help. It *might* have gotten us further,  
though, because the DDB backtrace now looks like this:

_sx_xlock_hard()
_sx_xlock()
zfs_znode_free()
zfs_freebsd_inactive()
VOP_INACTIVE_APV()
vinactive()
vput()
dounmount()
unmount()
syscall()
Xfast_syscall()

KGDB:

Unread portion of the kernel message buffer:
kernel trap 9 with interrupts disabled


Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer     = 0x20:0xffffffff80342b99
stack pointer           = 0x28:0xffffff803e9b7910
frame pointer           = 0x28:0xffffff803e9b7970
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 1398 (zpool)
panic: from debugger
cpuid = 0
KDB: stack backtrace:
Uptime: 1m28s
Physical memory: 2030 MB
Dumping 1405 MB: ...
Reading symbols: ...

#9  0xffffffff805986aa in trap (frame=0xffffff803e9b7860) at /usr/src/sys/amd64/amd64/trap.c:639
#10 0xffffffff8057dfe7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#11 0xffffffff80342b99 in _sx_xlock_hard (sx=0xffffff0071019181, tid=18446742976093954048, opts=Variable "opts" is not available.
)
     at /usr/src/sys/kern/kern_sx.c:575
#12 0xffffffff8034350e in _sx_xlock (sx=Variable "sx" is not available.
) at sx.h:155
#13 0xffffffff80ad6be7 in zfs_znode_free () from /boot/kernel/zfs.ko
#14 0xffffffff80b5af20 in ?? ()
#15 0xffffff803e9b79f0 in ?? ()
#16 0xffffff0071032000 in ?? ()
#17 0xffffff803e9b79c0 in ?? ()
#18 0xffffffff80af719a in zfs_freebsd_inactive () from /boot/kernel/zfs.ko
#19 0xffffffff805c5b5a in VOP_INACTIVE_APV (vop=0xffffff0071101a48, a=0xffffff0071019181) at vnode_if.c:1863
#20 0xffffffff803c6aaa in vinactive (vp=0xffffff0071290938, td=0xffffff0071019001) at vnode_if.h:807
#21 0xffffffff803cbf26 in vput (vp=0xffffff0071290938) at /usr/src/sys/kern/vfs_subr.c:2257
#22 0xffffffff803c57ef in dounmount (mp=0xffffff0002b9e8d0, flags=0, td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_mount.c:1333
#23 0xffffffff803c5df8 in unmount (td=0xffffff0071032000, uap=0xffffff803e9b7bf0)
     at /usr/src/sys/kern/vfs_mount.c:1174
#24 0xffffffff805980bf in syscall (frame=0xffffff803e9b7c80) at /usr/src/sys/amd64/amd64/trap.c:984
#25 0xffffffff8057e2c1 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373

(kgdb) fr 22
#22 0xffffffff803c57ef in dounmount (mp=0xffffff0002b9e8d0, flags=0, td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_mount.c:1333
1333                    vput(coveredvp);
(kgdb) p *mp
$1 = {mnt_mtx = {lock_object = {lo_name = 0xffffffff80611acd "struct mount mtx", lo_flags = 16973824, lo_data = 0,
       lo_witness = 0x0}, mtx_lock = 4}, mnt_gen = 2, mnt_list =  
{tqe_next = 0x0, tqe_prev = 0xffffff0002c71be8},
   mnt_op = 0xffffffff80b5ae80, mnt_vfc = 0xffffffff80b5ae20,  
mnt_vnodecovered = 0xffffff0071290938, mnt_syncer = 0x0,
   mnt_ref = 0, mnt_nvnodelist = {tqh_first = 0x0, tqh_last =  
0xffffff0002b9e930}, mnt_nvnodelistsize = 0,
   mnt_writeopcount = 0, mnt_kern_flag = 1627390088, mnt_flag = 4096,  
mnt_xflag = 0, mnt_noasync = 0,
   mnt_opt = 0xffffff0002f666f0, mnt_optnew = 0x0, mnt_maxsymlinklen =  
0, mnt_stat = {f_version = 537068824,
     f_type = 4, f_flags = 4096, f_bsize = 131072, f_iosize = 131072,  
f_blocks = 486, f_bfree = 328, f_bavail = 328,
     f_files = 334, f_ffree = 328, f_syncwrites = 0, f_asyncwrites =  
0, f_syncreads = 0, f_asyncreads = 0, f_spare = {
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, f_namemax = 255, f_owner = 0,  
f_fsid = {val = {1968303680, -171280380}},
     f_charspare = '\0' <repeats 79 times>, f_fstypename = "zfs", '\0'  
<repeats 12 times>,
     f_mntfromname = "crashtestslave/test_orig", '\0' <repeats 63  
times>,
     f_mntonname = "/crashtestslave/crashtestslave/test_orig", '\0'  
<repeats 47 times>},
   mnt_cred = 0xffffff0002f0b700, mnt_data = 0xffffff002489e000,  
mnt_time = 0, mnt_iosize_max = 65536,
   mnt_export = 0x0, mnt_label = 0x0, mnt_hashseed = 1597825977,  
mnt_lockref = 0, mnt_secondary_writes = 0,
   mnt_secondary_accwrites = 0, mnt_susp_owner = 0x0, mnt_gjprovider =  
0x0, mnt_explock = {lock_object = {
       lo_name = 0xffffffff80611ade "explock", lo_flags = 91422720,  
lo_data = 0, lo_witness = 0x0}, lk_lock = 1,
     lk_timo = 0, lk_pri = 80}}

Worth noting about the above: it's NOT the "pool root FS" that's being
unmounted here. The panic can also be triggered with
"zfs unmount crashtestslave/test_orig" (i.e. not the root FS, which was
previously the only one that panicked with zfs unmount, as opposed to
zpool export).

(kgdb) fr 21
#21 0xffffffff803cbf26 in vput (vp=0xffffff0071290938) at /usr/src/sys/kern/vfs_subr.c:2257
2257                    vinactive(vp, td);
(kgdb) p *vp
$3 = {v_type = VBAD, v_tag = 0xffffffff80600ff6 "none", v_op =  
0xffffffff80779700, v_data = 0x0, v_mount = 0x0,
   v_nmntvnodes = {tqe_next = 0x0, tqe_prev = 0xffffff0071290b38},  
v_un = {vu_mount = 0x0, vu_socket = 0x0,
     vu_cdev = 0x0, vu_fifoinfo = 0x0, vu_yield = 0}, v_hashlist =  
{le_next = 0x0, le_prev = 0x0}, v_hash = 0,
   v_cache_src = {lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0,  
tqh_last = 0xffffff0071290998}, v_cache_dd = 0x0,
   v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_lock =  
{lock_object = {lo_name = 0xffffffff80b56367 "zfs",
       lo_flags = 91947008, lo_data = 0, lo_witness = 0x0}, lk_lock =  
18446742976093954048, lk_timo = 51,
     lk_pri = 80}, v_interlock = {lock_object = {lo_name =  
0xffffffff806126d9 "vnode interlock", lo_flags = 16973824,
       lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock =  
0xffffff00712909d0, v_holdcnt = 1, v_usecount = 0,
   v_iflag = 2176, v_vflag = 0, v_writecount = 0, v_freelist =  
{tqe_next = 0x0, tqe_prev = 0xffffff0002cecc18},
   v_bufobj = {bo_mtx = {lock_object = {lo_name = 0xffffffff806126e9  
"bufobj interlock", lo_flags = 16973824,
         lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, bo_clean =  
{bv_hd = {tqh_first = 0x0,
         tqh_last = 0xffffff0071290a70}, bv_root = 0x0, bv_cnt = 0},  
bo_dirty = {bv_hd = {tqh_first = 0x0,
         tqh_last = 0xffffff0071290a90}, bv_root = 0x0, bv_cnt = 0},  
bo_numoutput = 0, bo_flag = 0,
     bo_ops = 0xffffffff8079afa0, bo_bsize = 131072, bo_object = 0x0,  
bo_synclist = {le_next = 0x0, le_prev = 0x0},
     bo_private = 0xffffff0071290938, __bo_vnode =  
0xffffff0071290938}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0}
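
As a reading aid for the vnode dump above (not part of the original kgdb session), here is a small sketch that converts two of the decimal fields to hex. The VI_* bit values are an assumption, recalled from sys/sys/vnode.h of roughly that era, and should be checked against the tree actually in use. Reading lk_lock as "owning thread pointer plus low flag bits" is likewise an assumption about the lockmgr encoding.

```python
# Assumed VI_* bit values (verify against sys/sys/vnode.h in your tree).
VI_FLAGS = {
    0x0020: "VI_MOUNT",
    0x0040: "VI_AGE",
    0x0080: "VI_DOOMED",
    0x0100: "VI_FREE",
    0x0400: "VI_OBJDIRTY",
    0x0800: "VI_DOINGINACT",
    0x1000: "VI_OWEINACT",
}


def decode_flags(value, names):
    """Return the names of all bits from `names` that are set in `value`."""
    return [name for bit, name in sorted(names.items()) if value & bit]


# Values copied from the "p *vp" output and from frame 23 of the backtrace.
v_iflag = 2176
lk_lock = 18446742976093954048
td_frame23 = 0xffffff0071032000

print(hex(v_iflag), decode_flags(v_iflag, VI_FLAGS))
print(hex(lk_lock), "same as td in frame 23:", lk_lock == td_frame23)
```

If the assumed flag values hold, v_iflag (0x880) decodes to VI_DOOMED | VI_DOINGINACT, which would fit v_type = VBAD. The lk_lock value is numerically equal to the td pointer shown in frame 23, which suggests the vnode lock is held by the unmounting thread itself.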

(kgdb) fr 11
#11 0xffffffff80342b99 in _sx_xlock_hard (sx=0xffffff0071019181, tid=18446742976093954048, opts=Variable "opts" is not available.
)
     at /usr/src/sys/kern/kern_sx.c:575
575                             owner = (struct thread *)SX_OWNER(x);
(kgdb) p *sx
$4 = {lock_object = {lo_name = 0xffffffff80b571 <Address 0xffffffff80b571 out of bounds>, lo_flags = 160000,
     lo_data = 0, lo_witness = 0x100000000000000}, sx_lock =  
16717361816799281152}
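
The sx values above are easier to judge in hex. The following sketch (a reading aid, not something run during the original session) checks the alignment of the sx pointer and whether the owner derived from sx_lock is a canonical amd64 address. The 0xf flag mask is an assumption about SX_LOCK_FLAGMASK; since the low bits are zero here, the result does not depend on it.

```python
def is_canonical(addr):
    """amd64 canonical form: bits 63..47 are all zeros or all ones."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


# Values copied from frame 11 and from the "p *sx" output.
sx_ptr = 0xffffff0071019181
sx_lock = 16717361816799281152
tid = 18446742976093954048

SX_FLAGMASK_ASSUMED = 0xF  # assumption: low four bits hold the lock flags
owner = sx_lock & ~SX_FLAGMASK_ASSUMED

print("sx pointer mod 8:", sx_ptr % 8)
print("sx_lock:", hex(sx_lock))
print("owner canonical:", is_canonical(owner))
print("tid:", hex(tid), "canonical:", is_canonical(tid))
```

The sx pointer is not 8-byte aligned and the derived owner (0xe800000000000000) is non-canonical. Dereferencing a non-canonical address on amd64 raises a general protection fault rather than a page fault, so this would be consistent with trap 9 at kern_sx.c:575 if the sx argument itself is garbage, for example because it lives in an already-freed znode.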

Regards,
Thomas


