From owner-freebsd-current@FreeBSD.ORG  Sat Sep  4 19:22:21 2004
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: freebsd-current@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 94B5916A4D5; Sat,  4 Sep 2004 19:22:21 +0000 (GMT)
Received: from miranda.expro.pl (mail2.expro.pl [193.25.166.6])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id BECF043D49; Sat,  4 Sep 2004 19:22:19 +0000 (GMT)
	(envelope-from winfried@miranda.expro.pl)
Received: by miranda.expro.pl (Postfix, from userid 1001)
	id C8787153DE; Sat,  4 Sep 2004 21:22:18 +0200 (CEST)
Date: Sat, 4 Sep 2004 21:22:18 +0200
From: Jan Srzednicki <w@expro.pl>
To: Robert Watson <rwatson@freebsd.org>
Message-ID: <20040904192218.GA5879@miranda.expro.pl>
References: <20040904140257.GC51038@miranda.expro.pl>
	<Pine.NEB.3.96L.1040904133421.89683B-100000@fledge.watson.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-2
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <Pine.NEB.3.96L.1040904133421.89683B-100000@fledge.watson.org>
User-Agent: Mutt/1.5.6i
cc: current@freebsd.org
Subject: Re: syscons problem in ddb, if_afdata initialization (was: Re:
	5.3-BETA3 , panic, probably IPv6+SMP+mpsafenet related)
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-current>
List-Post: <mailto:freebsd-current@freebsd.org>
List-Help: <mailto:freebsd-current-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-current>,
	<mailto:freebsd-current-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 04 Sep 2004 19:22:21 -0000

On Sat, Sep 04, 2004 at 01:51:06PM -0400, Robert Watson wrote:
> 
> The two sentence summary: this appears to involve three problems, one
> known.  There's a lock order reversal involving TCP and the "uid/gid" 
> layering violation in pf/ipfw, a new issue reported regarding if_afdata
> dereferencing from nd6_slowtimo, and an issue involving syscons calling
> into the tty code from it's low level DDB console API (should never
> happen). 

Well, if you need more data from me, I'm willing to send them. The panic
is easily reproduceable (it just happens after an hour of uptime).

> On Sat, 4 Sep 2004, Jan Srzednicki wrote:
> 
> > I'm running a fresh (cvsupped just after the BETA3 commit in newvers.sh) 
> > RELENG_5 system. It's a SMP with 2 processors, it's doing some
> > home-routing with pf. pf and some netgraph stuff is compiled into
> > kernel, otherwise it's mostly GENERIC. 
> > 
> > It seems to me that the panic is IPv6 and mpsafenet (and probably SMP) 
> > related. It happens about an hour after a boot (happened twice, so it's
> > repeatable). 
> 
> Saw your follow-up e-mail which suggests mpsafenet is not at cause.
> 
> > http://wrzask.pl/stuff/smieci/vmcore.2.dmesg
> 
[..]
> % --- trap 0x1, eip = 0, esp = 0xd51b0d7c, ebp = 0 ---
> % lock order reversal
> %  1st 0xc0808240 ifnet (ifnet) @ /usr/src/sys/netinet6/nd6.c:1777
> %  2nd 0xc07d36c4 user map (user map) @ /usr/src/sys/vm/vm_map.c:2997
> % KDB: stack backtrace:
> % witness_checkorder(c07d36c4,9,c0787eb3,bb5,0) at witness_checkorder+0x5bb
> % _sx_xlock(c07d36c4,c0787eb3,bb5,1000108,0) at _sx_xlock+0x5d
> % vm_map_lookup(d57e3b5c,0,1,d57e3b60,d57e3b50) at vm_map_lookup+0x3a
> % vm_fault(c07d3680,0,1,0,c12e8580) at vm_fault+0x7a
> % trap_pfault(8,d57e3c50,c05bd344,c07e2aa0,8) at trap_pfault+0x13e
> % trap(c0770018,10,c07e0010,3a00,c1405c00) at trap+0x363
> % calltrap() at calltrap+0x5
> % --- trap 0xc, eip = 0xc06677a6, esp = 0xd57e3c84, ebp = 0xd57e3ca0 ---
> 
> This is the real failure down here, the kernel page faulting; the lock
> order reversal is because the VM code tries to process the page fault
> running into lock order problems.
> 
> What follows though is a bit hard to read.
> 
> % nd6_slowtimo(0,8,c0774781,f7,57e49) at nd6_slowtimo+0x66
> % softclock(0,0,c0770f78,268,d57e3d10) at softclock+0x12f
> % ithread_loop(c12b9c80,d57e3d48,c0770d4d,32c,0) at ithread_loop+0x1fd
> % fork_exit(c057dab0,c12b9c80,d57e3d48) at fork_exit+0xa9
> % fork_trampoline() at fork_trampoline+0x8
> % --- trap 0x1, eip = 0, esp = 0xd57e3d7c, ebp = 0 ---
> 
> It looks like this is a cascading problem that started with dereferencing
> ND_IFINFO(ifp) in nd6_slowtimo().  if_afdata is initialized on demand by
> if_attachdomain1().  It would be interesting to know the contents of *ifp
> in the nd6_slowtimo() frame.  This might be a race/bug involving if_afdata
> initialization...

OK. I'm not sure if I'm doing that the right way, but here it is:

(kgdb) select-frame 22 (that's the one with nd6_slowtimo() in this dump)
(kgdb) print ifp
$1 = (struct ifnet *) 0xc1405c00
(kgdb) print *ifp
$2 = {if_softc = 0xc1405c00, if_link = {tqe_next = 0xc140d000, 
    tqe_prev = 0xc13fa004}, 
  if_xname = "pflog0\000\000\000\000\000\000\000\000\000", 
  if_dname = 0xc075e3a3 "pflog", if_dunit = 0, if_addrhead = {
    tqh_first = 0xc140e000, tqh_last = 0xc140e060}, if_klist = {
    kl_lock = 0xc07d6a20, kl_list = {slh_first = 0x0}}, if_pcount = 1, 
  if_carp = 0x0, if_bpf = 0xc14064c0, if_index = 5, if_timer = 0, 
  if_nvlans = 0, if_flags = 321, if_capabilities = 0, if_capenable = 0, 
  if_linkmib = 0x0, if_linkmiblen = 0, if_data = {ifi_type = 246 'ö', 
    ifi_physical = 0 '\0', ifi_addrlen = 0 '\0', ifi_hdrlen = 48 '0', 
    ifi_link_state = 0 '\0', ifi_recvquota = 0 '\0', ifi_xmitquota = 0 '\0', 
    ifi_mtu = 33208, ifi_metric = 0, ifi_baudrate = 0, ifi_ipackets = 0, 
    ifi_ierrors = 0, ifi_opackets = 0, ifi_oerrors = 0, ifi_collisions = 0, 
    ifi_ibytes = 0, ifi_obytes = 0, ifi_imcasts = 0, ifi_omcasts = 0, 
    ifi_iqdrops = 0, ifi_noproto = 0, ifi_hwassist = 0, ifi_unused = 0, 
    ifi_lastchange = {tv_sec = 1094301842, tv_usec = 136803}}, 
  if_multiaddrs = {tqh_first = 0x0, tqh_last = 0xc1405cac}, if_amcount = 0, 
  if_output = 0xc0484a80 <pflogoutput>, if_input = 0, 
  if_start = 0xc04849c0 <pflogstart>, if_ioctl = 0xc0484ac0 <pflogioctl>, 
  if_watchdog = 0, if_init = 0, if_resolvemulti = 0, if_snd = {ifq_head = 0x0, 
    ifq_tail = 0x0, ifq_len = 0, ifq_maxlen = 50, ifq_drops = 0, ifq_mtx = {
      mtx_object = {lo_class = 0xc07ad4e4, lo_name = 0xc1405c0c "pflog0", 
        lo_type = 0xc077c33e "if send queue", lo_flags = 196608, lo_list = {
          tqe_next = 0xc140e07c, tqe_prev = 0xc1405e04}, 
---Type <return> to continue, or q <return> to quit---
        lo_witness = 0xc07e3530}, mtx_lock = 4, mtx_recurse = 0}, 
    ifq_drv_head = 0x0, ifq_drv_tail = 0x0, ifq_drv_len = 0, 
    ifq_drv_maxlen = 0, altq_type = 0, altq_flags = 0, altq_disc = 0x0, 
    altq_ifp = 0xc1405c00, altq_enqueue = 0, altq_dequeue = 0, 
    altq_request = 0, altq_clfier = 0x0, altq_classify = 0, altq_tbr = 0x0, 
    altq_cdnr = 0x0}, if_broadcastaddr = 0x0, lltables = 0x0, if_label = 0x0, 
  if_prefixhead = {tqh_first = 0x0, tqh_last = 0xc1405d54}, if_afdata = {
    0x0 <repeats 37 times>}, if_afdata_initialized = 1, if_afdata_mtx = {
    mtx_object = {lo_class = 0xc07ad4e4, lo_name = 0xc077c32e "if_afdata", 
      lo_type = 0xc077c32e "if_afdata", lo_flags = 196608, lo_list = {
        tqe_next = 0xc1405ce8, tqe_prev = 0xc079cba0}, 
      lo_witness = 0xc07e3558}, mtx_lock = 4, mtx_recurse = 0}, 
  if_starttask = {ta_link = {stqe_next = 0x0}, ta_pending = 0, 
    ta_priority = 0, ta_func = 0xc060cc10 <if_start_deferred>, 
    ta_context = 0xc1405c00}}

Hope this helps.

It's also worth noting that on the previous kernel (that is: FreeBSD
5.3-BETA2 (SIN) #9: Sun Aug 29 14:24:19 CEST 2004) nothing like that
happened, so it appears to be somehow related to (or uncovered by) some
very recent commit.

> % Fatal trap 12: page fault while in kernel mode
> % cpuid = 0; apic id = 01
> % fault virtual address	= 0x8
> % fault code		= supervisor read, page not present
> % instruction pointer	= 0x8:0xc06677a6
> % stack pointer	        = 0x10:0xd57e3c84
> % frame pointer	        = 0x10:0xd57e3ca0
> % code segment		= base 0x0, limit 0xfffff, type 0x1b
> % 			= DPL 0, pres 1, def32 1, gran 1
> % processor eflags	= interrupt enabled, resume, IOPL = 0
> % current process		= 36 (swi5: clock sio)
> % panic: blockable sleep lock (sleep mutex) tty @ /usr/src/sys/kern/kern_event.c:1450
> % cpuid = 0
> % boot() called on cpu#0
> % Uptime: 1h0m18s
> % Dumping 127 MB
> %  16 32 48 64 80 96 112
> 
> 
> This one is pretty unfortunate, but hopefully easily fixable.  Here's my
> interpretation:
> 
> % #0  doadump () at pcpu.h:159
> % #1  0xc0595216 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:396
> % #2  0xc0594ceb in panic (fmt=0xc0776fd5 "blockable sleep lock (%s) %s @ %s:%d")
> %     at /usr/src/sys/kern/kern_shutdown.c:552
> % #3  0xc05bd589 in witness_checkorder (lock=0xc13fd910, flags=9, 
> %     file=0xc0770940 "/usr/src/sys/kern/kern_event.c", line=1450)
> %     at /usr/src/sys/kern/subr_witness.c:703
> % #4  0xc058b374 in _mtx_lock_flags (m=0xc13fd910, opts=0, 
> %     file=0xc0770940 "/usr/src/sys/kern/kern_event.c", line=1450)
> %     at /usr/src/sys/kern/kern_mutex.c:261
> % #5  0xc0575559 in knote (list=0xc13fd898, hint=0, islocked=0)
> %     at /usr/src/sys/kern/kern_event.c:1450
> % #6  0xc05cb793 in ttwwakeup (tp=0xc13fd800) at /usr/src/sys/kern/tty.c:2394
> % #7  0xc0715ef8 in scstart (tp=0xc13fd800)
> %     at /usr/src/sys/dev/syscons/syscons.c:1367
> % #8  0xc0716489 in scgetc (sc=0xc082ef60, flags=3)
> %     at /usr/src/sys/dev/syscons/syscons.c:3209
> % #9  0xc07167de in sccngetch (flags=3)
> %     at /usr/src/sys/dev/syscons/syscons.c:1553
> % #10 0xc05d0ad9 in cncheckc () at /usr/src/sys/kern/tty_cons.c:567
> % #11 0xc05d0b15 in cngetc () at /usr/src/sys/kern/tty_cons.c:548
> % #12 0xc04ab335 in db_readline (lstart=0xc07cea00 "", lsize=120)
> 
> The polled console support used by DDB has lead somehow to a call to
> ttwwakeup(), which should never happen (TM) using the polled interface.
> This is because we can't run all the TTY level code from the debugger,
> only low level polled console operation.  I'm not sure who to point at for 

The hint here would be that after I got the ddb prompt I tried to scroll
up some lines (scroll lock + page up) and then the (second) panic
occured.

-- 
Jan 'Winfried' Srzednicki
w@expro.pl