Date:      Wed, 15 Jul 2009 20:47:25 +0400
From:      Kamigishi Rei <spambox@haruhiism.net>
To:        Lawrence Stewart <lstewart@freebsd.org>
Cc:        FreeBSD Current <freebsd-current@freebsd.org>, Stefan Bethke <stb@lassitu.de>, Larry Rosenman <ler@lerctr.org>
Subject:   Re: ppp triggers GPF panic
Message-ID:  <4A5E081D.40508@haruhiism.net>
In-Reply-To: <4A59C703.4020507@freebsd.org>
References:  <128E7C52-CCBD-4BAF-A4AE-1D914A3968CB@lassitu.de>	<4A58DD8D.3090308@freebsd.org>	<6D58BB3C-85F4-44A6-A43B-F6E18F056FA4@lassitu.de>	<4A598DDF.4010306@freebsd.org>	<6C047344-397E-4F14-97F1-C61FD80AAC3F@lassitu.de>	<4A59BB11.70706@freebsd.org>	<76EFB2CC-1ADE-4AFF-82FC-0461C92122A9@lassitu.de> <4A59C703.4020507@freebsd.org>

Lawrence Stewart wrote:
> In the meantime, I'm going to try figure out how to reproduce this. 
> I'll keep everyone notified of any progress.
I've found the revision at which the issue strikes for me, and a nice 
way to reproduce it. Note: In all cases and for all revisions, I'm using 
the patched version of tcp_sack.c (see r195655 at
<http://svn.freebsd.org/viewvc/base/head/sys/netinet/tcp_sack.c?r1=190948&r2=195655>).
I would also like to know if someone manages to reproduce the panic 
using my instructions below.

r195136 is fairly stable.
r195146 crashes almost instantly with the following setup: add 2 aliases
to lo0, run an iperf server in jail 0, and then run
"iperf -c xxx.xxx.xxx.1 -t YY -N -P 10" (where YY is >10) from jail 1,
which is started on the lo0 alias xxx.xxx.xxx.3. The panic triggers a
second or two after iperf is started.
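
For anyone who wants to script the steps above, here is a minimal sketch that only builds and prints the command lines; it does not run anything. The function name, the /32 netmask on the aliases, and the default -t value of 30 are assumptions made for illustration and are not part of the report. Jail creation is left out on purpose, so the iperf server and client commands still have to be run in the right jails by hand. The addresses stay masked as in the report.

```python
import shlex


def iperf_repro_commands(server_ip, client_ip, duration=30, streams=10):
    """Return the reproduction commands as printable shell strings.

    Hypothetical helper: nothing is executed, the caller decides where
    (host or jail) each command is run.
    """
    if duration <= 10:
        # The report states that the -t value (YY) must be greater than 10.
        raise ValueError("duration must be greater than 10 seconds")
    commands = [
        # Two aliases on lo0; the /32 netmask is an assumption.
        ["ifconfig", "lo0", "alias", server_ip, "netmask", "255.255.255.255"],
        ["ifconfig", "lo0", "alias", client_ip, "netmask", "255.255.255.255"],
        # iperf server side (jail 0 in the report).
        ["iperf", "-s"],
        # iperf client side (jail 1 on the second alias), flags as reported.
        ["iperf", "-c", server_ip, "-t", str(duration), "-N", "-P", str(streams)],
    ]
    return [shlex.join(cmd) for cmd in commands]


if __name__ == "__main__":
    for line in iperf_repro_commands("xxx.xxx.xxx.1", "xxx.xxx.xxx.3"):
        print(line)
```

Printing the commands instead of executing them keeps the sketch safe to run on any machine; replace the masked addresses with real lo0 aliases before using the output.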

I haven't checked yet whether the panic also triggers outside of jails.

r195634 is stable, r195484 is not.

More information:

The system is a Core2 Duo 3.00GHz on a Gigabyte GA-Q35M S2 board with the
SATA controller running in AHCI mode and memory in dual channel DDR2-800
mode (the panic triggers in 2 and 4GB RAM configurations; I didn't check
other variants). 2 NICs are installed, em0 and re0; em0 is constantly
sending at 10-25 Mbps on average. Web and database servers run in
separate jails and communicate via aliases on lo0 - that's how I first
got the panic to happen.

The kernel config is GENERIC from the May 09 snapshot with the following
additional options:

options         IPFILTER        # IPFilter
options         IPFILTER_LOG    # IPFilter logging
options         IPFIREWALL      # IPFW2
options         IPFIREWALL_DEFAULT_TO_ACCEPT
options         DUMMYNET        # IPFW Traffic Shaper
options         DEVICE_POLLING  # Polling support for NICs etc
options         KDB_UNATTENDED  # Automagically dump and reboot+savecore after a panic

and without WITNESS.

Crash info:

kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x14ee288
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80586265
stack pointer           = 0x28:0xffffff80787525c0
frame pointer           = 0x28:0xffffff80787525f0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 2119 (iperf)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 50s
Physical memory: 4014 MB

#0  doadump () at pcpu.h:223
223     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0xffffffff805950b3 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:419
#2  0xffffffff8059550c in panic (fmt=Variable "fmt" is not available.
)
    at /usr/src/sys/kern/kern_shutdown.c:575
#3  0xffffffff8085d95d in trap_fatal (frame=0xc, eva=Variable "eva" is not available.
)
    at /usr/src/sys/amd64/amd64/trap.c:852
#4  0xffffffff8085e5f5 in trap (frame=0xffffff8078752510)
    at /usr/src/sys/amd64/amd64/trap.c:345
#5  0xffffffff808448e3 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:223
#6  0xffffffff80586265 in _mtx_lock_sleep (m=0xffffffff80e60823,
    tid=18446742976740145840, opts=Variable "opts" is not available.
) at /usr/src/sys/kern/kern_mutex.c:407
#7  0xffffffff805863be in _mtx_lock_flags (m=Variable "m" is not available.
)
    at /usr/src/sys/kern/kern_mutex.c:203
#8  0xffffffff80642a95 in netisr_queue_internal (proto=1,
    m=0xffffff00c5b69a00, cpuid=Variable "cpuid" is not available.
) at /usr/src/sys/net/netisr.c:829
#9  0xffffffff80642b79 in netisr_queue_src (proto=1, source=Variable "source" is not available.
)
    at /usr/src/sys/net/netisr.c:859
#10 0xffffffff8063ead9 in if_simloop (ifp=0xffffff0004898800,
    m=0xffffff00c5b69a00, af=2, hlen=0) at /usr/src/sys/net/if_loop.c:400
#11 0xffffffff8063ec36 in looutput (ifp=0xffffff0004898800,
    m=0xffffff00c5b69a00, dst=0xffffff8078752770, ro=Variable "ro" is not available.
)
    at /usr/src/sys/net/if_loop.c:296
#12 0xffffffff8069dc17 in ip_output (m=0xffffff00c5b69a00, opt=Variable "opt" is not available.
)
    at /usr/src/sys/netinet/ip_output.c:624
#13 0xffffffff80703274 in tcp_output (tp=0xffffff00c56a0b60)
    at /usr/src/sys/netinet/tcp_output.c:1188
#14 0xffffffff8070de6b in tcp_usr_rcvd (so=Variable "so" is not available.
) at tcp_offload.h:280
#15 0xffffffff805f9992 in soreceive_generic (so=0xffffff00c56add48,
    psa=0xffffff8078752a78, uio=0xffffff8078752a40, mp0=Variable "mp0" is not available.
)
    at /usr/src/sys/kern/uipc_socket.c:1840
#16 0xffffffff805fd99e in kern_recvit (td=0xffffff0097873ab0, s=4,
    mp=0xffffff8078752af0, fromseg=UIO_USERSPACE, controlp=0x0)
    at /usr/src/sys/kern/uipc_syscalls.c:970
#17 0xffffffff805fdb41 in recvit (td=Variable "td" is not available.
)
    at /usr/src/sys/kern/uipc_syscalls.c:1082
#18 0xffffffff805fdcc2 in recvfrom (td=0xffffff0097873ab0,
    uap=0xffffff8078752c00) at /usr/src/sys/kern/uipc_syscalls.c:1126
#19 0xffffffff8085de9f in syscall (frame=0xffffff8078752c90)
    at /usr/src/sys/amd64/amd64/trap.c:984
#20 0xffffffff80844b70 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:364
#21 0x0000000800c5074c in ?? ()
Previous frame inner to this frame (corrupt stack?)
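
A small reading aid for the trace above: kgdb prints tid in frame #6 in decimal, while td in frames #16 and #18 is printed in hex. The sketch below, with constant and function names made up for illustration, converts the decimal value and compares the two. It also prints the remainder of the m pointer from frame #6 modulo 8; treating 8 bytes as the relevant alignment is an assumption, and the result is only an observation that may or may not be related to the fault.

```python
# Values copied verbatim from frames #6, #16 and #18 of the backtrace.
TID_DECIMAL = 18446742976740145840   # tid= in _mtx_lock_sleep (frame #6)
TD_POINTER = 0xffffff0097873ab0      # td= in kern_recvit / recvfrom
MUTEX_POINTER = 0xffffffff80e60823   # m= in _mtx_lock_sleep (frame #6)


def same_thread(tid, td):
    """True when the decimal tid and the hex td are the same value."""
    return tid == td


def alignment_remainder(addr, alignment=8):
    """Remainder of addr modulo alignment; 0 means the address is aligned."""
    return addr % alignment


if __name__ == "__main__":
    print(hex(TID_DECIMAL))                      # 0xffffff0097873ab0
    print(same_thread(TID_DECIMAL, TD_POINTER))  # True
    print(alignment_remainder(MUTEX_POINTER))    # 3
```

So the tid handed to the mutex code is the same thread pointer seen in the syscall frames, and the m address ends in 0x3 rather than on an 8-byte boundary.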

--
Kamigishi Rei
KREI-RIPE


