Date: Sun, 12 Nov 2006 17:10:20 GMT
From: Ruslan Ermilov <ru@FreeBSD.org>
To: freebsd-amd64@FreeBSD.org
Subject: Re: amd64/105437: 6.2-BETA3 crashes on amd64
Message-ID: <200611121710.kACHAKSJ081003@freefall.freebsd.org>
The following reply was made to PR amd64/105437; it has been noted by GNATS.

From: Ruslan Ermilov <ru@FreeBSD.org>
To: Wojciech Puchar <wojtek@3miasto.net>
Cc: bug-followup@FreeBSD.org
Subject: Re: amd64/105437: 6.2-BETA3 crashes on amd64
Date: Sun, 12 Nov 2006 20:07:12 +0300

On Sun, Nov 12, 2006 at 05:09:22PM +0100, Wojciech Puchar wrote:
> #0  doadump () at pcpu.h:172
> #1  0x0000000000000004 in ?? ()
> #2  0xffffffff8025deb3 in boot (howto=260) at
> ../../../kern/kern_shutdown.c:409
> #3  0xffffffff8025e4b6 in panic (fmt=0xffffff003d8fa980 "°\226\217=")
>      at ../../../kern/kern_shutdown.c:565
> #4  0xffffffff803e87f2 in trap_fatal (frame=0xffffff003d8fa980,
> eva=18446742975230744240)
>      at ../../../amd64/amd64/trap.c:660
> #5  0xffffffff803e8d16 in trap (frame=
>        {tf_rdi = -1098993325056, tf_rsi = 4, tf_rdx = -1098478802560,
> tf_rcx = 4, tf_r8 = -1098478802496, tf_r9 = -1098993325056, tf_rax = 2,
> tf_rbx = -1098478802560, tf_rbp = 4, tf_r10 = -1098993325056, tf_r11 =
> -1264970144, tf_r12 = -1098478802560, tf_r13 = -1098993325056, tf_r14 =
> -2141357264, tf_r15 = -1098758394592, tf_trapno = 12, tf_addr = 212,
> tf_flags = -2144054761, tf_err = 0, tf_rip = -2144839500, tf_cs = 8,
> tf_rflags = 65543, tf_rsp = -1264969928, tf_ss = 16})
>      at ../../../amd64/amd64/trap.c:238
> #6  0xffffffff803d640b in calltrap () at
> ../../../amd64/amd64/exception.S:168
> #7  0xffffffff802858b4 in turnstile_setowner (ts=0xffffff001ee4ac00,
> owner=0x4)
>      at ../../../kern/subr_turnstile.c:432
> #8  0xffffffff80285ebb in turnstile_wait (lock=0xffffff002ce56d20,
> owner=0x4)
>      at ../../../kern/subr_turnstile.c:591
> #9  0xffffffff80252f39 in _mtx_lock_sleep (m=0xffffff002ce56d20,
> tid=18446742975230749056,
>      opts=1032825216, file=0x4 <Address 0x4 out of bounds>,
> line=1032825280)
>      at ../../../kern/kern_mutex.c:579
>
Line 579 has:

:     turnstile_wait(&m->mtx_object, mtx_owner(m));

Some references:

: /*
:  * Internal utility macros.
:  */
: #define mtx_unowned(m)  ((m)->mtx_lock == MTX_UNOWNED)
:
: #define mtx_owner(m)    (mtx_unowned((m)) ? NULL \
:     : (struct thread *)((m)->mtx_lock & MTX_FLAGMASK))

: /*
:  * State bits kept in mutex->mtx_lock, for the DEFAULT lock type. None of this,
:  * with the exception of MTX_UNOWNED, applies to spin locks.
:  */
: #define MTX_RECURSED    0x00000001      /* lock recursed (for MTX_DEF only) */
: #define MTX_CONTESTED   0x00000002      /* lock contested (for MTX_DEF only) */
: #define MTX_UNOWNED     0x00000004      /* Cookie for free mutex */
: #define MTX_FLAGMASK    ~(MTX_RECURSED | MTX_CONTESTED)

mtx_owner(m) returns the value 4, which is MTX_UNOWNED. But if mtx_lock
were exactly MTX_UNOWNED, mtx_unowned() would return true, and mtx_owner()
would return NULL. This means that mtx_lock has some other bit set in
addition to MTX_UNOWNED, which is illegal. Most likely, it's MTX_DESTROYED
(which is defined as (MTX_CONTESTED | MTX_UNOWNED)). You should print the
mutex to be sure.

So it looks like the code is trying to lock a corrupt mutex. Please
recompile your kernel with the following options:

options         INVARIANTS              # Enable calls of extra sanity checking
options         INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
options         WITNESS                 # Enable checks to detect deadlocks and cycles
options         WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed

The kernel will run more slowly, but this could help catch the bug earlier.
It could turn out to be a problem with the IPv6 routing code.

> #10 0xffffffff8033c7ab in nd6_output (ifp=0xffffff003063c000,
> origifp=0xffffff003063c000,
>      m0=0xffffff0001cd6400, dst=0xffffff002e437a60, rt0=0xffffff002b96f630)
>      at ../../../netinet6/nd6.c:2004
> #11 0xffffffff80338c12 in ip6_output (m0=0x100010170400120, opt=0x500,
> ro=0xffffffffb49a1a00,
>      flags=0, im6o=0x0, ifpp=0x0, inp=0xffffff0001c304c0) at
> ../../../netinet6/ip6_output.c:994
>
I don't understand why "ro" is not NULL here, because tcp_output()
below calls it with a NULL argument; this is probably due to a -O2
compilation.

> #12 0xffffffff80315a6d in tcp_output (tp=0xffffff0010b165e0) at
> ../../../netinet/tcp_output.c:1059
> #13 0xffffffff8031c6a5 in tcp_timer_rexmt (xtp=0xffffff001ee4ac00)
>      at ../../../netinet/tcp_timer.c:537
> #14 0xffffffff8026d02a in softclock (dummy=0xffffff001ee4ac00) at
> ../../../kern/kern_timeout.c:290
> #15 0xffffffff802442b6 in ithread_loop (arg=0xffffff00000053c0) at
> ../../../kern/kern_intr.c:682
> #16 0xffffffff80242d03 in fork_exit (callout=0xffffffff80244170
> <ithread_loop>,
>      arg=0xffffff00000053c0, frame=0xffffffffb49a1c50) at
> ../../../kern/kern_fork.c:821
> #17 0xffffffff803d676e in fork_trampoline () at
> ../../../amd64/amd64/exception.S:394
> #18 0x0000000000000000 in ?? ()
> #19 0x0000000000000000 in ?? ()
> #20 0x0000000000000001 in ?? ()
> #21 0x0000000000000000 in ?? ()
> #22 0x0000000000000000 in ?? ()
> #23 0x0000000000000000 in ?? ()
> #24 0x0000000000000000 in ?? ()
> #25 0x0000000000000000 in ?? ()
> #26 0x0000000000000000 in ?? ()
> #27 0x0000000000000000 in ?? ()
> #28 0x0000000000000000 in ?? ()
> #29 0x0000000000000000 in ?? ()
> #30 0x0000000000000000 in ?? ()
> #31 0x0000000000000000 in ?? ()
> #32 0x0000000000000000 in ?? ()
> #33 0x0000000000000000 in ?? ()
> #34 0x0000000000000000 in ?? ()
> #35 0x0000000000000000 in ?? ()
> #36 0x0000000000000000 in ?? ()
> #37 0x0000000000000000 in ?? ()
> #38 0x0000000000000000 in ?? ()
> #39 0x0000000000000000 in ?? ()
> #40 0x0000000000000000 in ?? ()
> #41 0x0000000000000000 in ?? ()
> #42 0x0000000000000000 in ?? ()
> #43 0x0000000000000000 in ?? ()
> #44 0x0000000000000000 in ?? ()
> #45 0x0000000000000000 in ?? ()
> #46 0x0000000000000000 in ?? ()
> #47 0x0000000000000000 in ?? ()
> #48 0x0000000000000000 in ?? ()
> #49 0x0000000000000000 in ?? ()
> #50 0x00000000007b4000 in ?? ()
> #51 0xffffff003d8fa980 in ?? ()
> #52 0xffffff00000053c0 in ?? ()
> #53 0x0000000000000001 in ?? ()
> #54 0xffffff003d8f96b0 in ?? ()
> #55 0xffffff001ffa4980 in ?? ()
> #56 0xffffffffb49a1b58 in ?? ()
> #57 0xffffff003d8fa980 in ?? ()
> #58 0xffffffff802734db in sched_switch (td=0xffffff00000053c0, newtd=0x0,
> flags=0)
>
> then zeroes up to #130

Cheers,
--
Ruslan Ermilov
ru@FreeBSD.org
FreeBSD committer