From: Kamigishi Rei
Date: Wed, 15 Jul 2009 20:47:25 +0400
To: Lawrence Stewart
Cc: FreeBSD Current, Stefan Bethke, Larry Rosenman
Subject: Re: ppp triggers GPF panic
Message-ID: <4A5E081D.40508@haruhiism.net>
In-Reply-To: <4A59C703.4020507@freebsd.org>
List-Id: Discussions about the use of FreeBSD-current

Lawrence Stewart wrote:
> In the meantime, I'm going to try figure out how to reproduce this.
> I'll keep everyone notified of any progress.

I've found the revision at which the issue first appears for me, and a reliable way to reproduce it.

Note: in all cases and for all revisions, I'm using the patched version of tcp_sack.c (see r195655 @ http://svn.freebsd.org/viewvc/base/head/sys/netinet/tcp_sack.c?r1=190948&r2=195655 ). I would also like to know whether anyone else manages to reproduce the panic using the instructions below.

r195136 is fairly stable.
r195146 crashes as soon as I add 2 aliases on lo0, run an iperf server in jail 0, and then run "iperf -c xxx.xxx.xxx.1 -t YY -N -P 10" (where YY is >10) from jail 1, which is started on the lo0 alias xxx.xxx.xxx.3. The panic triggers within a second or two of iperf starting. I haven't checked yet whether it also happens outside of jails.

r195634 is stable, r195484 is not.

More information:
The system is a Core2 Duo 3.00GHz on a Gigabyte GA-Q35M S2 board with the SATA controller running in AHCI mode and memory in dual-channel DDR2-800 mode (the panic triggers in both the 2 and 4GB RAM configurations; I didn't check other variants). 2 NICs are installed, em0 and re0; em0 is constantly sending at 10-25 Mbps on average. Web and database servers run in separate jails and communicate via aliases on lo0, which is how I first hit the panic.

The kernel config is GENERIC from the May 09 snapshot, without WITNESS and with the following additional options:

options         IPFILTER                # IPFilter
options         IPFILTER_LOG            # IPFilter logging
options         IPFIREWALL              # IPFW2
options         IPFIREWALL_DEFAULT_TO_ACCEPT
options         DUMMYNET                # IPFW Traffic Shaper
options         DEVICE_POLLING          # Polling support for NICs etc
options         KDB_UNATTENDED          # Automagically dump and reboot+savecore after a panic
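
For anyone trying to reproduce this, the steps above can be sketched roughly as the script below. It is only a sketch under assumptions: the 192.0.2 prefix stands in for xxx.xxx.xxx, the duration of 30 stands in for YY, and the repro_commands helper is a made-up name. It only prints the commands so they can be reviewed and then run by hand in the right place (the server in jail 0, the client in the jail bound to the .3 alias). Only the iperf client flags are taken from the report.

```sh
#!/bin/sh
# Sketch of the reproduction steps from this report. Only the iperf client
# flags (-c, -t, -N, -P 10) come from the report; the address prefix and the
# duration below are placeholders (assumptions).

NET="${NET:-192.0.2}"        # stands in for xxx.xxx.xxx (assumed prefix)
DURATION="${DURATION:-30}"   # YY in the report; must be greater than 10

# Hypothetical helper: prints the commands for review instead of running them.
repro_commands() {
    if [ "$DURATION" -le 10 ]; then
        echo "DURATION must be greater than 10" >&2
        return 1
    fi
    # Two aliases on lo0: .1 for the iperf server, .3 for the client jail.
    echo "ifconfig lo0 inet $NET.1 netmask 255.255.255.255 alias"
    echo "ifconfig lo0 inet $NET.3 netmask 255.255.255.255 alias"
    # Server side ("jail 0" in the report).
    echo "iperf -s"
    # Client side, to be run from the jail bound to $NET.3 ("jail 1").
    echo "iperf -c $NET.1 -t $DURATION -N -P 10"
}

repro_commands
```

Setting NET and DURATION in the environment overrides the placeholders; how the jails themselves are created is left out on purpose, since the report does not say.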

Crash info:

kernel trap 12 with interrupts disabled

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x14ee288
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80586265
stack pointer           = 0x28:0xffffff80787525c0
frame pointer           = 0x28:0xffffff80787525f0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 2119 (iperf)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 50s
Physical memory: 4014 MB

#0  doadump () at pcpu.h:223
223     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0xffffffff805950b3 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:419
#2  0xffffffff8059550c in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:575
#3  0xffffffff8085d95d in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:852
#4  0xffffffff8085e5f5 in trap (frame=0xffffff8078752510)
    at /usr/src/sys/amd64/amd64/trap.c:345
#5  0xffffffff808448e3 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:223
#6  0xffffffff80586265 in _mtx_lock_sleep (m=0xffffffff80e60823,
    tid=18446742976740145840, opts=Variable "opts" is not available.
) at /usr/src/sys/kern/kern_mutex.c:407
#7  0xffffffff805863be in _mtx_lock_flags (m=Variable "m" is not available.
) at /usr/src/sys/kern/kern_mutex.c:203
#8  0xffffffff80642a95 in netisr_queue_internal (proto=1,
    m=0xffffff00c5b69a00, cpuid=Variable "cpuid" is not available.
) at /usr/src/sys/net/netisr.c:829
#9  0xffffffff80642b79 in netisr_queue_src (proto=1, source=Variable "source" is not available.
) at /usr/src/sys/net/netisr.c:859
#10 0xffffffff8063ead9 in if_simloop (ifp=0xffffff0004898800,
    m=0xffffff00c5b69a00, af=2, hlen=0) at /usr/src/sys/net/if_loop.c:400
#11 0xffffffff8063ec36 in looutput (ifp=0xffffff0004898800,
    m=0xffffff00c5b69a00, dst=0xffffff8078752770, ro=Variable "ro" is not available.
) at /usr/src/sys/net/if_loop.c:296
#12 0xffffffff8069dc17 in ip_output (m=0xffffff00c5b69a00, opt=Variable "opt" is not available.
) at /usr/src/sys/netinet/ip_output.c:624
#13 0xffffffff80703274 in tcp_output (tp=0xffffff00c56a0b60)
    at /usr/src/sys/netinet/tcp_output.c:1188
#14 0xffffffff8070de6b in tcp_usr_rcvd (so=Variable "so" is not available.
) at tcp_offload.h:280
#15 0xffffffff805f9992 in soreceive_generic (so=0xffffff00c56add48,
    psa=0xffffff8078752a78, uio=0xffffff8078752a40, mp0=Variable "mp0" is not available.
) at /usr/src/sys/kern/uipc_socket.c:1840
#16 0xffffffff805fd99e in kern_recvit (td=0xffffff0097873ab0, s=4,
    mp=0xffffff8078752af0, fromseg=UIO_USERSPACE, controlp=0x0)
    at /usr/src/sys/kern/uipc_syscalls.c:970
#17 0xffffffff805fdb41 in recvit (td=Variable "td" is not available.
) at /usr/src/sys/kern/uipc_syscalls.c:1082
#18 0xffffffff805fdcc2 in recvfrom (td=0xffffff0097873ab0,
    uap=0xffffff8078752c00) at /usr/src/sys/kern/uipc_syscalls.c:1126
#19 0xffffffff8085de9f in syscall (frame=0xffffff8078752c90)
    at /usr/src/sys/amd64/amd64/trap.c:984
#20 0xffffffff80844b70 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:364
#21 0x0000000800c5074c in ?? ()
Previous frame inner to this frame (corrupt stack?)

-- 
Kamigishi Rei
KREI-RIPE