From owner-freebsd-current@FreeBSD.ORG Sat Dec 22 00:53:24 2012 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9721EF17; Sat, 22 Dec 2012 00:53:24 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-oa0-f50.google.com (mail-oa0-f50.google.com [209.85.219.50]) by mx1.freebsd.org (Postfix) with ESMTP id 2610F8FC12; Sat, 22 Dec 2012 00:53:23 +0000 (UTC) Received: by mail-oa0-f50.google.com with SMTP id n16so5328724oag.9 for ; Fri, 21 Dec 2012 16:53:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=9JgngyOvVeZLjeXc+y1V73qebFBp1j080brh42f/4EY=; b=EGqaWcI1YF9ewIqArYO3liNyUd1QCmOzNsXS6HnsB9y4NZqt+PPxgpdyT3pDKr0h2w r/lHO1yNsD6l3v7aY5PJTMQ551d3Rzb2FuHOYlgrLqSVZrOuNofUwYbseMF8Z1kY1j7y to9yA6s3nneX4ro2Owe5vhiuyMGfbWMZRIkntPVd3kBPXvrqijI+w+KG0Q2bRmqSjqfg xvhgjR1U6e3YesjGDwH3JqLw8iskYBFj1CmZ4Vy28rawis0e9nyyKSc8gUI0FCDnSRQ3 rVLCE7x5YKYKqi26UBXClqfFMCbRC3pY83XOOzN96CjjndS40MU7sqgaVEJF9KiTjs2z oSzw== MIME-Version: 1.0 Received: by 10.60.6.226 with SMTP id e2mr1074515oea.56.1356137236378; Fri, 21 Dec 2012 16:47:16 -0800 (PST) Received: by 10.76.143.33 with HTTP; Fri, 21 Dec 2012 16:47:15 -0800 (PST) In-Reply-To: References: Date: Fri, 21 Dec 2012 16:47:15 -0800 Message-ID: Subject: Re: "Memory modified after free" - by whom? From: Garrett Cooper To: mdf@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-net@freebsd.org, Adrian Chadd , FreeBSD Current X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 22 Dec 2012 00:53:24 -0000 On Fri, Dec 21, 2012 at 4:21 PM, Garrett Cooper wrote: > On Mon, Dec 10, 2012 at 3:18 PM, wrote: >> On Mon, Dec 10, 2012 at 3:10 PM, Adrian Chadd wrote: >>> 9216 sounds like a jumbo frame mbuf. So the NIC is writing to an mbuf >>> after it's finalised/freed. >>> >>> I have a similar bug showing up on ath(4) RX. :( >> >> Compile with DEBUG_MEMGUARD in the kernel configuration, and then set >> vm.memguard.desc to the name of the UMA zone used for the 9216 byte >> allocations, mbuf_jumbo_9k. This should cause a panic when the memory >> is touched after free. > > Tada (dang, that's nifty stuff)! > > # sysctl vm.memguard.desc=mbuf_jumbo_9k > vm.memguard.descM: -> mbuf_jumboem_9k > # ory modified after free 0xffffff8000401000(9216) val=0 @ 0xffffff8000401000 > Memory modified after free 0xffffff8000405000(9216) val=0 @ 0xffffff8000405000 > Memory modified after free 0xffffff8000409000(9216) val=5a5a5a5a @ > 0xffffff8000409000 > > > > > > > Fatal trap 1: privileged instruction fault while in kernel mode > Fatal trap 1: privileged instruction fault while in kernel mode > Memory modified after free 0xffffff800040d000(9216) val=5a5a5a5a @ > 0xffffff800040d000 > Fatal trap 1: privileged instruction fault while in kernel mode > cpuid = 3; > cpuid = 1; > apic id = 02 > cpuid = 0; apic id = 06 > apic id = 00 > instruction pointer = 0x20:0xffffffff80af5099 > instruction pointer = 0x20:0xffffffff80af5099 > instruction pointer = 0x20:0xffffffff80af5099 > Fatal trap 1: privileged instruction fault while in kernel mode > stack pointer = 0x28:0xffffff8496fff880 > stack pointer = 0x28:0xffffff8496fe1880 > cpuid = 2; frame pointer = 0x28:0xffffff8496fff8b0 > frame pointer = 0x28:0xffffff8496fe18b0 > stack pointer = 0x28:0xffffff849705d880 > code segment = base 0x0, limit 0xfffff, type 0x1b > frame pointer = 0x28:0xffffff849705d8b0 > apic id = 04 > code segment = base 0x0, limit 0xfffff, type 0x1b > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > = DPL 0, pres 1, long 1, def32 0, gran 1 > instruction pointer = 0x20:0xffffffff80af5099 > processor eflags = = DPL 0, pres 1, long > 1, def32 0, gran 1 > interrupt enabled, processor eflags = stack pointer = > 0x28:0xffffff8497067880 > interrupt enabled, resume, resume, frame pointer = > 0x28:0xffffff84970678b0 > IOPL = 0 > code segment = base 0x0, limit 0xfffff, type 0x1b > current process = = DPL 0, pres 1, long > 1, def32 0, gran 1 > processor eflags = 12 (irq280: ix0:que 3) > ilock order reversal: (Giant after non-sleepable) > 1st 0xfffffe0078148b38 ix0:rx(3) (ix0:rx(3)) @ > /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4296 > 2nd 0xffffffff814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff8496fff320 > kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff8496fff3d0 > witness_checkorder() at witness_checkorder+0xc47/frame 0xffffff8496fff450 > __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xffffff8496fff490 > ukbd_poll() at ukbd_poll+0x28/frame 0xffffff8496fff4b0 > kbdmux_poll() at kbdmux_poll+0x5b/frame 0xffffff8496fff4d0 > cngrab() at cngrab+0x35/frame 0xffffff8496fff4f0 > kdb_trap() at kdb_trap+0x124/frame 0xffffff8496fff550 > trap_fatal() at trap_fatal+0x345/frame 0xffffff8496fff5b0 > trap() at trap+0x836/frame 0xffffff8496fff7c0 > calltrap() at calltrap+0x8/frame 0xffffff8496fff7c0 > --- trap 0x1, rip = 0xffffffff80af5099, rsp = 0xffffff8496fff880, rbp > = 0xffffff8496fff8b0 --- > uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xffffff8496fff8b0 > mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xffffff8496fff8e0 > uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xffffff8496fff960 > m_getjcl() at m_getjcl+0xce/frame 0xffffff8496fff9a0 > ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xffffff8496fffa20 > ixgbe_rxeof() at ixgbe_rxeof+0x5ce/frame 0xffffff8496fffad0 > ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xffffff8496fffb20 > intr_event_execute_handlers() at > intr_event_execute_handlers+0x90/frame 0xffffff8496fffb60 > ithread_loop() at ithread_loop+0x161/frame 0xffffff8496fffbb0 > fork_exit() at fork_exit+0x84/frame 0xffffff8496fffbf0 > fork_trampoline() at fork_trampoline+0xe/frame 0xffffff8496fffbf0 > --- trap 0, rip = 0, rsp = 0xffffff8496fffcb0, rbp = 0 --- > [ thread pid 12 tid 100218 ] > Stopped at uma_find_refcnt+0x79: > db> > > This looks interesting: > > $ git diff sys/dev/ixgbe/ixgbe.c > diff --git a/sys/dev/ixgbe/ixgbe.c b/sys/dev/ixgbe/ixgbe.c > index 40f5488..19f842b 100644 > --- a/sys/dev/ixgbe/ixgbe.c > +++ b/sys/dev/ixgbe/ixgbe.c > @@ -897,8 +897,10 @@ ixgbe_qflush(struct ifnet *ifp) > > for (int i = 0; i < adapter->num_queues; i++, txr++) { > IXGBE_TX_LOCK(txr); > - while ((m = buf_ring_dequeue_sc(txr->br)) != NULL) > + while ((m = buf_ring_dequeue_sc(txr->br)) != NULL) { > m_freem(m); > + m = NULL; > + } > IXGBE_TX_UNLOCK(txr); > } > if_qflush(ifp); > > I'm going to recompile the driver and restart the network a few > times (that's the sticking point) and see what happens. Now the T4 is angry: ^C^C^C^C^C^C^C^C^C^C^C^C^C^C^CMemory modified after free 0xffffff8000401000(9216) val=0 @ 0xffffff8000401000 Fatal trap 1: privileged instruction fault while in kernel mode cpuid = 0; apic id = 00 instruction pointer = 0x20:0xffffffff80af5099 stack pointer = 0x28:0xffffff8496f41910 frame pointer ^C = 0x28:0xffffff8496f41940 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 12 (irq265: t4nex0:1.0) lock order reversal: (Giant after non-sleepable) 1st 0xfffffe00790eb690 cxgbe1 rxq0-fl (cxgbe1 rxq0-fl) @ /usr/src/sys/modules/cxgbe/if_cxgbe/../../../dev/cxgbe/t4_sge.c:1008 2nd 0xffffffff814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff8496f413b0 kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff8496f41460 witness_checkorder() at witness_checkorder+0xc47/frame 0xffffff8496f414e0 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xffffff8496f41520 ukbd_poll() at ukbd_poll+0x28/frame 0xffffff8496f41540 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xffffff8496f41560 cngrab() at cngrab+0x35/frame 0xffffff8496f41580 kdb_trap() at kdb_trap+0x124/frame 0xffffff8496f415e0 trap_fatal() at trap_fatal+0x345/frame 0xffffff8496f41640 trap() at trap+0x836/frame 0xffffff8496f41850 calltrap() at calltrap+0x8/frame 0xffffff8496f41850 --- trap 0x1, rip = 0xffffffff80af5099, rsp = 0xffffff8496f41910, rbp = 0xffffff8496f41940 --- uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xffffff8496f41940 mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xffffff8496f41970 uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xffffff8496f419f0 refill_fl() at refill_fl+0x28f/frame 0xffffff8496f41a60 service_iq() at service_iq+0x821/frame 0xffffff8496f41b00 t4_intr() at t4_intr+0x2e/frame 0xffffff8496f41b20 intr_event_execute_handlers() at intr_event_execute_handlers+0x90/frame 0xffffff8496f41b60 ithread_loop() at ithread_loop+0x161/frame 0xffffff8496f41bb0 fork_exit() at fork_exit+0x84/frame 0xffffff8496f41bf0 fork_trampoline() at fork_trampoline+0xe/frame 0xffffff8496f41bf0 --- trap 0, rip = 0, rsp = 0xffffff8496f41cb0, rbp = 0 --- [ thread pid 12 tid 100196 ] Stopped at uma_find_refcnt+0x79: Guess I'll rebuild the modules with the changes that np@ just made today and see if that changes anything... Thanks, -Garrett