Date: Tue, 15 Oct 2013 13:18:58 +0100 (BST)
From: Anton Shterenlikht <mexas@bris.ac.uk>
To: davide@freebsd.org, mexas@bris.ac.uk
Cc: freebsd-current@freebsd.org, freebsd-ia64@freebsd.org
Subject: Re: panic: wrong page state m 0xe00000027a9adb40 + savecore deadlock
Message-ID: <201310151218.r9FCIwBx043808@mech-cluster241.men.bris.ac.uk>
In-Reply-To: <CACYV=-GE%2BSUR_RrXfhaH9FekQ3QC6DuYuSpcdhAok0kH0uBShQ@mail.gmail.com>

>From davide.italiano@gmail.com Tue Oct 15 11:30:07 2013
>
>On Tue, Oct 15, 2013 at 10:43 AM, Anton Shterenlikht <mexas@bris.ac.uk> wrote:
>
>> Anyway, savecore eventually deadlocks:
>>
>> panic: deadlkres: possible deadlock detected for 0xe0000000127b7b00, blocked for 901401 ticks
>>
>
>[trim]
>
>>
>> Tracing command savecore pid 805 tid 100079 td 0xe0000000127b7b00
>> cpu_switch(0xe0000000127b7b00, 0xe000000011178900, 0xe000000012402fc0, 0x9ffc0000005e7e80) at cpu_switch+0xd0
>> sched_switch(0xe0000000127b7b00, 0xe000000011178900, 0x9ffc000000f15698, 0x9ffc000000f15680) at sched_switch+0x890
>> mi_switch(0x103, 0x0, 0xe0000000127b7b00, 0x9ffc00000062d1f0) at mi_switch+0x3f0
>> turnstile_wait(0xe000000012402fc0, 0xe000000012400480, 0x0, 0x9ffc000000dcb698) at turnstile_wait+0x960
>> __mtx_lock_sleep(0x9ffc0000010f9998, 0xe0000000127b7b00, 0xe000000012402fc0, 0x9ffc000000dc0558, 0x742) at __mtx_lock_sleep+0x2f0
>> __mtx_lock_flags(0x9ffc0000010f9980, 0x0, 0x9ffc000000dd4a90, 0x742) at __mtx_lock_flags+0x1e0
>> vfs_vmio_release(0xa00000009ebe72f0, 0xe00000027ed2ab70, 0x3, 0xa00000009ebe736c, 0xa00000009ebe7498, 0xa00000009ebe72f8, 0x9ffc000000dd4a90, 0x9ffc0000010f9680) at vfs_vmio_release+0x290
>> getnewbuf(0xe0000000127f4ec0, 0x0, 0x0, 0x8000, 0xa00000009ebe99a8, 0x0, 0x9ffc0000010f0798, 0xa00000009ebe72f0) at getnewbuf+0x7e0
>> getblk(0xe0000000127f4ec0, 0x4cbaa, 0x8000, 0x0, 0x0, 0x0, 0x0, 0x0) at getblk+0xee0
>> ffs_balloc_ufs2(0xe0000000127f4ec0, 0x4cbaa, 0xa0000000c60ba000, 0xe000000011165a00, 0x7f050000, 0xa00000009dd79160) at ffs_balloc_ufs2+0x2950
>> ffs_write(0xa00000009dd79248, 0x3000, 0x265d50000) at ffs_write+0x5c0
>> VOP_WRITE_APV(0x9ffc000000e94ac0, 0xa00000009dd79248, 0x0, 0x0) at VOP_WRITE_APV+0x330
>> vn_write(0xe0000000129ae820, 0xa00000009dd79360, 0xe000000011165a00, 0x0, 0xe0000000129ae830, 0xe0000000127f4ec0) at vn_write+0x450
>> vn_io_fault(0xe0000000129ae820, 0xa00000009dd79360, 0xe000000011165a00, 0x0, 0xe0000000127b7b00) at vn_io_fault+0x330
>> dofilewrite(0xe0000000127b7b00, 0x7, 0xe0000000129ae820, 0xa00000009dd79360, 0xffffffffffffffff, 0x0) at dofilewrite+0x180
>> kern_writev(0xe0000000127b7b00, 0x7, 0xa00000009dd79360) at kern_writev+0xa0
>> sys_write(0xe0000000127b7b00, 0xa00000009dd794e8, 0x9ffc000000abac80, 0x48d) at sys_write+0x100
>> syscall(0xe0000000129d04a0, 0x140857000, 0x8000, 0xe0000000127b7b00, 0x0, 0x0, 0x9ffc000000ab7280, 0x8) at syscall+0x5e0
>> --More--
>
>I'm not commenting on the first panic you got -- but on the deadlock
>reported by DEADLKRES. I think that's the vm_page lock.
>You can run kgdb /boot/${KERNEL}/kernel where ${KERNEL} is the incrimined one
>then l *vfs_vmio_release+0x290
>to get the exact point where it fails.

Like this?

# kgdb /boot/kernel/kernel
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "ia64-marcel-freebsd"...
(kgdb) l *vfs_vmio_release+0x290
0x9ffc0000006b8830 is in vfs_vmio_release (/usr/src/sys/kern/vfs_bio.c:1859).
1854            /*
1855             * In order to keep page LRU ordering consistent, put
1856             * everything on the inactive queue.
1857             */
1858            vm_page_lock(m);
1859            vm_page_unwire(m, 0);
1860
1861            /*
1862             * Might as well free the page if we can and it has
1863             * no valid data. We also free the page if the
(kgdb)

>I'm unsure here because 'show alllocks' and 'show locks' outputs are
>empty -- are you building your kernel with WITNESS etc..?

I think so:

# Debugging support. Always need this:
options         KDB                     # Enable kernel debugger support.
options         KDB_TRACE               # Print a stack trace for a panic.
# For full debugger support use (turn off in stable branch):
options         DDB                     # Support DDB
options         GDB                     # Support remote GDB
options         DEADLKRES               # Enable the deadlock resolver
options         INVARIANTS              # Enable calls of extra sanity checking
options         INVARIANT_SUPPORT       # required by INVARIANTS
options         WITNESS                 # Enable checks to detect deadlocks and cycles
options         WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed
options         MALLOC_DEBUG_MAXZONES=8 # Separate malloc(9) zones

# textdump(4)
options         TEXTDUMP_PREFERRED
options         TEXTDUMP_VERBOSE

# http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-deadlocks.html
options         DEBUG_LOCKS
options         DEBUG_VFS_LOCKS
options         DIAGNOSTIC

Also, does this look right:

$ sysctl -a | grep kdb
debug.ddb.scripting.scripts: kdb.enter.panic=textdump set; capture on; run lockinfo; show pcpu; bt; ps; alltrace; capture off; call doadump; reset
kdb.enter.witness=run lockinfo
debug.kassert.do_kdb: 0
debug.kdb.alt_break_to_debugger: 0
debug.kdb.break_to_debugger: 0
debug.kdb.trap_code: 0
debug.kdb.trap: 0
debug.kdb.panic: 0
debug.kdb.enter: 0
debug.kdb.current: ddb
debug.kdb.available: ddb gdb
debug.witness.kdb: 0
$

Thank you

Anton