From owner-freebsd-net@FreeBSD.ORG Sat Dec 22 01:34:43 2012 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 65993B84; Sat, 22 Dec 2012 01:34:43 +0000 (UTC) (envelope-from yanegomi@gmail.com) Received: from mail-ob0-f181.google.com (mail-ob0-f181.google.com [209.85.214.181]) by mx1.freebsd.org (Postfix) with ESMTP id 10A078FC0A; Sat, 22 Dec 2012 01:34:42 +0000 (UTC) Received: by mail-ob0-f181.google.com with SMTP id oi10so5079927obb.40 for ; Fri, 21 Dec 2012 17:34:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=AuDKPsfPfAyYcLQXT7JSW//5g/py7FRN7WgfwBaFrXo=; b=wNPWLRiz+sf9dVWledFKaB1QsGNTllHxVHHYXLij/56vCm50TkqN2lMvv9AXa6oO9u 8fvQ015AXA33c9YPOAEVCiT8A0tk4xRzfOM14CtpIzrAs2pNPqVk0N+AlbPaoerezcXS TRuqs8hxsmhj7KenDTIJ6Fi2Io2+0KXB5k+2rSuhSelh5Crk6tUVr7sC5Td/E589VkHj ysNAvQ+y0gQFWrNm/9/wjzBdd1hEQpEKQNJAvlj/LtiweBoPMbNAlZw42AyTCD4swSim JZBVsvz/AuaJiIXekWCEi5AgmOq/cT1iYuMQgoqNzd2vr33w3QsiQHONfz0X6V4W/s9t Epzw== MIME-Version: 1.0 Received: by 10.60.10.133 with SMTP id i5mr716801oeb.24.1356140082417; Fri, 21 Dec 2012 17:34:42 -0800 (PST) Received: by 10.76.143.33 with HTTP; Fri, 21 Dec 2012 17:34:42 -0800 (PST) In-Reply-To: References: Date: Fri, 21 Dec 2012 17:34:42 -0800 Message-ID: Subject: Re: "Memory modified after free" - by whom? From: Garrett Cooper To: freebsd-net@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: FreeBSD Current X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 22 Dec 2012 01:34:43 -0000 (clipping off mdf and adrian so they don't get directly spammed :)..) Crud. Continuing the processor after panic didn't work, so it might be a case of cxgbe "shot the sheriff" or something else in the stack is doing something wonky: db> c Memory modified after free 0xffffff8000405000(9216) val=0 @ 0xffffff8000405000 Memory modified after free 0xffffff800040d000(9216) val=0 @ 0xffffff800040d000 Fatal trap 1: privileged instruction fault while in kernel mode Memory modified after free 0xffffff8000409000(9216) val=0 @ 0xffffff8000409000 cpuid = 0; Fatal trap 1: privileged instruction fault while in kernel mode apic id = 00 cpuid = 1; Fatal trap 1: privileged instruction fault while in kernel mode instruction pointer = 0x20:0xffffffff80af5099 cpuid = 2; stack pointer = 0x28:0xffffff8496f41910 apic id = 04 apic id = 02 instruction pointer = 0x20:0xffffffff80af5099 instruction pointer = 0x20:0xffffffff80af5099 stack pointer = 0x28:0xffffff849705d880 frame pointer = 0x28:0xffffff8496f41940 stack pointer = 0x28:0xffffff8497067880 code segment = base 0x0, limit 0xfffff, type 0x1b frame pointer = 0x28:0xffffff849705d8b0 = DPL 0, pres 1, long 1, def32 0, gran 1 frame pointer = 0x28:0xffffff84970678b0 Fatal trap 1: privileged instruction fault while in kernel mode code segment = base 0x0, limit 0xfffff, type 0x1b code segment = base 0x0, limit 0xfffff, type 0x1b processor eflags = = DPL 0, pres 1, long 1, def32 0, gran 1 = DPL 0, pres 1, long 1, def32 0, gran 1 cpuid = 3; interrupt enabled, processor eflags = processor eflags = resume, interrupt enabled, apic id = 06 interrupt enabled, instruction pointer = 0x20:0xffffffff80af5099 resume, resume, stack pointer = 0x28:0xffffff8497071880 IOPL = 0 frame pointer = 0x28:0xffffff84970718b0 IOPL = 0 current process = code segment = base 0x0, limit 0xfffff, type 0x1b 12 (irq283: ix1:que 1) Ilock order reversal: (Giant after non-sleepable) 1st 0xfffffe009a295918 ix1:rx(1) (ix1:rx(1)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:4298 2nd 0xffffffff814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff849705d320 kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff849705d3d0 witness_checkorder() at witness_checkorder+0xc47/frame 0xffffff849705d450 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xffffff849705d490 ukbd_poll() at ukbd_poll+0x28/frame 0xffffff849705d4b0 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xffffff849705d4d0 cngrab() at cngrab+0x35/frame 0xffffff849705d4f0 kdb_trap() at kdb_trap+0x124/frame 0xffffff849705d550 trap_fatal() at trap_fatal+0x345/frame 0xffffff849705d5b0 trap() at trap+0x836/frame 0xffffff849705d7c0 calltrap() at calltrap+0x8/frame 0xffffff849705d7c0 --- trap 0x1, rip = 0xffffffff80af5099, rsp = 0xffffff849705d880, rbp = 0xffffff849705d8b0 --- uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xffffff849705d8b0 mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xffffff849705d8e0 uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xffffff849705d960 m_getjcl() at m_getjcl+0xce/frame 0xffffff849705d9a0 ixgbe_refresh_mbufs() at ixgbe_refresh_mbufs+0x77/frame 0xffffff849705da20 ixgbe_rxeof() at ixgbe_rxeof+0x4e9/frame 0xffffff849705dad0 ixgbe_msix_que() at ixgbe_msix_que+0x9b/frame 0xffffff849705db20 intr_event_execute_handlers() at intr_event_execute_handlers+0x90/frame 0xffffff849705db60 ithread_loop() at ithread_loop+0x161/frame 0xffffff849705dbb0 fork_exit() at fork_exit+0x84/frame 0xffffff849705dbf0 fork_trampoline() at fork_trampoline+0xe/frame 0xffffff849705dbf0 --- trap 0, rip = 0, rsp = 0xffffff849705dcb0, rbp = 0 --- [ thread pid 12 tid 100224 ] Stopped at uma_find_refcnt+0x79: I setup asymmetric MTUs to see if I can narrow down whether the issue is a jumbo frame or cxgbe/ixgbe issue and so far so good. Trying again with jumbo frames paniced on boot when loading the module (really...?): # Memory modified after free 0xffffff8000401000(9216) val=0 @ 0xffffff8000401000 Fatal trap 1: privileged instruction fault while in kernel mode cpuid = 1; apic id = 02 instruction pointer = 0x20:0xffffffff80af5099 stack pointer = 0x28:0xffffff8496fc9750 frame pointer = 0x28:0xffffff8496fc9780 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 3280 (ifconfig) lock order reversal: (Giant after non-sleepable) 1st 0xfffffe00462f7808 ix0:rx(0) (ix0:rx(0)) @ /usr/src/sys/modules/ixgbe/../../dev/ixgbe/ixgbe.c:3938 2nd 0xffffffff814457b8 Giant (Giant) @ /usr/src/sys/dev/usb/input/ukbd.c:1946 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffff8496fc91f0 kdb_backtrace() at kdb_backtrace+0x39/frame 0xffffff8496fc92a0 witness_checkorder() at witness_checkorder+0xc47/frame 0xffffff8496fc9320 __mtx_lock_flags() at __mtx_lock_flags+0x89/frame 0xffffff8496fc9360 ukbd_poll() at ukbd_poll+0x28/frame 0xffffff8496fc9380 kbdmux_poll() at kbdmux_poll+0x5b/frame 0xffffff8496fc93a0 cngrab() at cngrab+0x35/frame 0xffffff8496fc93c0 kdb_trap() at kdb_trap+0x124/frame 0xffffff8496fc9420 trap_fatal() at trap_fatal+0x345/frame 0xffffff8496fc9480 trap() at trap+0x836/frame 0xffffff8496fc9690 calltrap() at calltrap+0x8/frame 0xffffff8496fc9690 --- trap 0x1, rip = 0xffffffff80af5099, rsp = 0xffffff8496fc9750, rbp = 0xffffff8496fc9780 --- uma_find_refcnt() at uma_find_refcnt+0x79/frame 0xffffff8496fc9780 mb_ctor_clust() at mb_ctor_clust+0x8f/frame 0xffffff8496fc97b0 uma_zalloc_arg() at uma_zalloc_arg+0xe7/frame 0xffffff8496fc9830 m_getjcl() at m_getjcl+0xce/frame 0xffffff8496fc9870 ixgbe_init_locked() at ixgbe_init_locked+0x62c/frame 0xffffff8496fc9930 ixgbe_ioctl() at ixgbe_ioctl+0x32d/frame 0xffffff8496fc9980 ifioctl() at ifioctl+0xc84/frame 0xffffff8496fc9a40 kern_ioctl() at kern_ioctl+0x1ce/frame 0xffffff8496fc9a90 sys_ioctl() at sys_ioctl+0x11f/frame 0xffffff8496fc9ae0 amd64_syscall() at amd64_syscall+0x265/frame 0xffffff8496fc9bf0 Xfast_syscall() at Xfast_syscall+0xfb/frame 0xffffff8496fc9bf0 --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80118481a, rsp = 0x7fffffffd488, rbp = 0x7fffffffd4a0 --- [ thread pid 3280 tid 100220 ] Stopped at uma_find_refcnt+0x79: Setting the MTU exhibits the same problem... just with smaller mbufs: Memory modified after free 0xfffffe014ce86800(2048) val=cc0c0001 @ 0xfffffe014ce86800 Memory modified after free 0xfffffe00bd86f000(2048) val=cc0c0001 @ 0xfffffe00bd86f000 Memory modified after free 0xfffffe00bd3a4000(2048) val=cc0c0001 @ 0xfffffe00bd3a4000 Memory modified after free 0xfffffe011a55c000(2048) val=2d290c00 @ 0xfffffe011a55c000 Memory modified after free 0xfffffe011a5fa800(2048) val=cc0c0001 @ 0xfffffe011a5fa800 Memory modified after free 0xfffffe00bd018800(2048) val=ffffffff @ 0xfffffe00bd018800 Memory modified after free 0xfffffe00bd1c4000(2048) val=7f565000 @ 0xfffffe00bd1c4000 Memory modified after free 0xfffffe011a521800(2048) val=529c0a00 @ 0xfffffe011a521800 Memory modified after free 0xfffffe011a521800(2048) val=cc0c0001 @ 0xfffffe011a521800 Memory modified after free 0xfffffe00bd0ce800(2048) val=cc0c0001 @ 0xfffffe00bd0ce800 Memory modified after free 0xfffffe00bd9a3000(2048) val=ffffffff @ 0xfffffe00bd9a3000 Memory modified after free 0xfffffe00bd8b6800(2048) val=cc0c0001 @ 0xfffffe00bd8b6800 Memory modified after free 0xfffffe00bd923000(2048) val=cc0c0001 @ 0xfffffe00bd923000 Memory modified after free 0xfffffe0188f70800(2048) val=2d290c00 @ 0xfffffe0188f70800 Memory modified after free 0xfffffe0180452000(2048) val=ffffffff @ 0xfffffe0180452000 Memory modified after free 0xfffffe014cd6b800(2048) val=cc0c0001 @ 0xfffffe014cd6b800 Memory modified after free 0xfffffe014c967800(2048) val=cc0c0001 @ 0xfffffe014c967800 Memory modified after free 0xfffffe01800d1000(2048) val=cc0c0001 @ 0xfffffe01800d1000 Memory modified after free 0xfffffe014c967800(2048) val=cc0c0001 @ 0xfffffe014c967800 Memory modified after free 0xfffffe0188fdf000(2048) val=cc0c0001 @ 0xfffffe0188fdf000 Memory modified after free 0xfffffe0188fdf000(2048) val=cc0c0001 @ 0xfffffe0188fdf000 Memory modified after free 0xfffffe00bd9a3000(2048) val=cc0c0001 @ 0xfffffe00bd9a3000 Memory modified after free 0xfffffe011a6cb000(2048) val=cc0c0001 @ 0xfffffe011a6cb000 Setting sysctl vm.memguard.desc=mbuf sends the kernel into a spiral of "use-after-frees". FWIW, I was able to repro the issue with ixgbe kldloading the module after boot sometimes, so I'm thinking that the locking/memory management in the recent set of changes is potentially off. Thanks, -Garrett PS Thanks np for the recent fix for cxgbe :).