Date:      Tue, 04 Sep 2007 15:21:02 +0200
From:      Ian FREISLICH <ianf@clue.co.za>
Cc:        bu7cher@yandex.ru, rwatson@freebsd.org, freebsd-current@freebsd.org
Subject:   Re: Panic in ipfw 
Message-ID:  <E1ISYL0-000158-7n@clue.co.za>
In-Reply-To: Message from Ian FREISLICH <ianf@clue.co.za>  of "Tue, 03 Jul 2007 15:06:41 +0200." <E1I5i5Z-0006fa-VE@clue.co.za> 

Ian FREISLICH wrote:
> "Andrey V. Elsukov" wrote:
> > Hi, Ian.
> > 
> > > I got this panic yesterday on a fairly busy firewall.  I have some
> > > private patches to ip_fw2.c and to the em driver (see the earlier
> > > "em0 hijacking traffic to port 623" thread).  I don't think this
> > > panic is a result of those changes.
> > 
> > > It occurred round about the time an address was added to an interface.
> > 
> > I have a patch that can help you (i guess..).
> > Can you test this patch?
> > 
> > http://butcher.heavennet.ru/patches/kernel/inaddr_locking/
> 
> Thanks.  Wow, that looks like it touches a lot more than just ipfw.
> It took about 1.5 years of production at 2.3 billion packets a day
> to trigger this condition twice.  It's going to be difficult to
> tell if this patch fixes the problem.

Well, I've just had another related panic (I think):

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 06
fault virtual address   = 0xbd
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc059be50
stack pointer           = 0x28:0xe95ecaf0
frame pointer           = 0x28:0xe95ecaf0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 36 (idlepoll)
trap number             = 12
panic: page fault
cpuid = 1
KDB: stack backtrace:
db_trace_self_wrapper(c0691f59,e95ec994,c04fe600,c06a6a38,1,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c06a6a38,1,c0684a3b,e95ec9a0,1,...) at kdb_backtrace+0x29
panic(c0684a3b,c06a7d15,c4e5cc8c,1,1,...) at panic+0x10f
trap_fatal(c06f28c0,0,1,0,c4ddd800,...) at trap_fatal+0x327
trap_pfault(c05895ae,c4ddd800,0,e95eca70,c4e5cab0,...) at trap_pfault+0x244
trap(e95ecab0) at trap+0x363
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc059be50, esp = 0xe95ecaf0, ebp = 0xe95ecaf0 ---
in_localip(b3c9cc29,caf3d600,e95ecb30,c57dc960,0,...) at in_localip+0x50
ip_fastforward(c5588100,e,257,c0665207,c503d800,...) at ip_fastforward+0x2b9
ether_demux(c503d800,c5588100,3,4,2,...) at ether_demux+0x165
ether_input(c503d800,c5588100,0,c5578000,800,...) at ether_input+0x3e4
vlan_input(c4dcac00,c5588100,47,e95ecc30,c046a3fd,...) at vlan_input+0x16d
ether_demux(c4dcac00,c5588100,6,47,c52f3354,...) at ether_demux+0x102
ether_input(c4dcac00,c5588100,e95ecc54,c0519523,c06f6420,...) at ether_input+0x3e4
em_rxeof(e95e0008,c04f0028,c4e5cab0,c4e5cab0,429f1f94,...) at em_rxeof+0x45e
em_poll(c4dcac00,0,32,c4e5da00,e95eccd0,...) at em_poll+0x141
ether_poll(32,0,c068f8f0,24f,bb8,...) at ether_poll+0xd1
poll_idle(0,e95ecd38,0,0,0,...) at poll_idle+0xdb
fork_exit(c04f3fb3,0,e95ecd38) at fork_exit+0x2a
fork_trampoline() at fork_trampoline+0x8
--- trap 0, eip = 0, esp = 0xe95ecd70, ebp = 0 ---
Uptime: 12d22h28m28s
Physical memory: 2039 MB
Dumping 235 MB: 220 204 188 172 156 140 124 108 92 76 60 44 28 12

#0  doadump () at pcpu.h:195
195     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0xc04fe342 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc04fe62f in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:563
#3  0xc065b3b2 in trap_fatal (frame=0xe95ecab0, eva=189)
    at /usr/src/sys/i386/i386/trap.c:873
#4  0xc065b602 in trap_pfault (frame=0xe95ecab0, usermode=0, eva=189)
    at /usr/src/sys/i386/i386/trap.c:784
#5  0xc065bf1f in trap (frame=0xe95ecab0) at /usr/src/sys/i386/i386/trap.c:462
#6  0xc0642c8b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc059be50 in in_localip (in={s_addr = 3016346665})
    at /usr/src/sys/netinet/in.c:125
#8  0xc05a96f7 in ip_fastforward (m=0xc5588100)
    at /usr/src/sys/netinet/ip_fastfwd.c:330
#9  0xc058ff3e in ether_demux (ifp=0xc503d800, m=0xc5588100)
    at /usr/src/sys/net/if_ethersubr.c:779
#10 0xc059041e in ether_input (ifp=0xc503d800, m=0xc5588100)
    at /usr/src/sys/net/if_ethersubr.c:701
#11 0xc0591fb1 in vlan_input (ifp=0xc4dcac00, m=0xc5588100)
    at /usr/src/sys/net/if_vlan.c:973
#12 0xc058fedb in ether_demux (ifp=0xc4dcac00, m=0xc5588100)
    at /usr/src/sys/net/if_ethersubr.c:752
#13 0xc059041e in ether_input (ifp=0xc4dcac00, m=0xc5588100)
    at /usr/src/sys/net/if_ethersubr.c:701
#14 0xc046ba35 in em_rxeof (adapter=0xc4d79000, count=46)
    at /usr/src/sys/dev/em/if_em.c:4301
#15 0xc046d5ef in em_poll (ifp=0xc4dcac00, cmd=POLL_ONLY, count=50)
    at /usr/src/sys/dev/em/if_em.c:1375
#16 0xc04f30a1 in ether_poll (count=50) at /usr/src/sys/kern/kern_poll.c:339
#17 0xc04f408e in poll_idle () at /usr/src/sys/kern/kern_poll.c:590
#18 0xc04de792 in fork_exit (callout=0xc04f3fb3 <poll_idle>, arg=0x0, 
    frame=0xe95ecd38) at /usr/src/sys/kern/kern_fork.c:787
#19 0xc0642d00 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:205
(kgdb) frame 7
#7  0xc059be50 in in_localip (in={s_addr = 3016346665})
    at /usr/src/sys/netinet/in.c:125
125                     if (IA_SIN(ia)->sin_addr.s_addr == in.s_addr)
(kgdb) l 
120     in_localip(struct in_addr in)
121     {
122             struct in_ifaddr *ia;
123
124             LIST_FOREACH(ia, INADDR_HASH(in.s_addr), ia_hash) {
125                     if (IA_SIN(ia)->sin_addr.s_addr == in.s_addr)
126                             return 1;
127             }
128             return 0;
129     }
(kgdb) print ia
$1 = (struct in_ifaddr *) 0x1
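
For reference, two numbers in the dump above can be decoded with a few lines of C. This is only a sketch: the helper names s_addr_to_dotted and fault_offset are made up for illustration, and the byte order handling assumes the value was read on a little-endian host (i386 here), where the low byte of s_addr is the first octet of the address.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format an s_addr value as a dotted quad.  Assumption: the integer was
 * read on a little-endian host, so its least significant byte is the
 * first octet of the network-byte-order address.
 */
const char *
s_addr_to_dotted(uint32_t s_addr, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
        (unsigned)(s_addr & 0xff),
        (unsigned)((s_addr >> 8) & 0xff),
        (unsigned)((s_addr >> 16) & 0xff),
        (unsigned)((s_addr >> 24) & 0xff));
    return (buf);
}

/* Distance between the faulting address and the pointer being dereferenced. */
uintptr_t
fault_offset(uintptr_t fault_va, uintptr_t base)
{
    return (fault_va - base);
}
```

With the values from this dump, s_addr = 3016346665 (0xb3c9cc29, the first argument in the in_localip frame) formats as 41.204.201.179. If ia was already 0x1 when the fault happened, the fault address 0xbd (eva=189) is 0xbc bytes past it, which would fit a member read through the bogus pointer. Which member of struct in_ifaddr sits at that offset is not established here.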

This code is touched by Andrey's patch.  I'm going to put that patch
into production tomorrow - this locking issue is raising its head
too often now.
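
The failure mode suggested by the backtrace (a hash chain walked with no lock while an address is being added or removed) can be illustrated in userland. The sketch below is not Andrey's patch and not kernel code: the table, the names addr_add, addr_remove and addr_is_local, the bucket count and the hash function are all hypothetical, and a single pthread mutex stands in for whatever lock the kernel would use. The point is only that the lookup takes the same lock as the writers, so it can never follow a pointer that is being changed underneath it.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define NHASH 16    /* hypothetical bucket count, must be a power of two */

struct addr_entry {
    uint32_t addr;              /* address in network byte order */
    struct addr_entry *next;    /* next entry in the same bucket */
};

static struct addr_entry *buckets[NHASH];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical hash, only for this sketch. */
static unsigned
bucket_of(uint32_t addr)
{
    return ((addr ^ (addr >> 16)) & (NHASH - 1));
}

/* Writer: link a new entry at the head of its bucket, under the lock. */
int
addr_add(uint32_t addr)
{
    struct addr_entry *e = malloc(sizeof(*e));

    if (e == NULL)
        return (-1);
    e->addr = addr;
    pthread_mutex_lock(&table_lock);
    e->next = buckets[bucket_of(addr)];
    buckets[bucket_of(addr)] = e;
    pthread_mutex_unlock(&table_lock);
    return (0);
}

/* Writer: unlink the first matching entry under the lock, then free it. */
int
addr_remove(uint32_t addr)
{
    struct addr_entry **pp, *e;
    int removed = 0;

    pthread_mutex_lock(&table_lock);
    for (pp = &buckets[bucket_of(addr)]; (e = *pp) != NULL; pp = &e->next) {
        if (e->addr == addr) {
            *pp = e->next;
            removed = 1;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    if (removed)
        free(e);
    return (removed);
}

/* Reader: same walk as in_localip(), but holding the lock for the whole walk. */
int
addr_is_local(uint32_t addr)
{
    struct addr_entry *e;
    int found = 0;

    pthread_mutex_lock(&table_lock);
    for (e = buckets[bucket_of(addr)]; e != NULL; e = e->next) {
        if (e->addr == addr) {
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    return (found);
}
```

A plain mutex serializes readers as well as writers, which is the simplest thing to reason about but not what a forwarding path would want. A read-mostly lock would be the more likely choice for a lookup that runs once per packet.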

Interestingly, in my testing this patch *improves* firewall performance
in an SMP environment by ~30Kpps (10%).  I can't think why that should
be.

The test ruleset was:
00005 10203042 295891608 tee 666 ip from any to any via bge0
00010 20406493 591856418 allow ip from not me to not me
65535      193     24418 allow ip from any to any

I got 307Kpps before the patch was applied and 343Kpps with the patch.
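
As a quick check of those two figures, the relative gain can be computed with integer arithmetic. The function name gain_tenths_pct is made up for this sketch.

```c
/* Relative change from 'before' to 'after', in tenths of a percent. */
int
gain_tenths_pct(int before, int after)
{
    return ((after - before) * 1000 / before);
}
```

For 307 Kpps before and 343 Kpps after, the difference is 36 Kpps and the function returns 117, that is about 11.7%, slightly above the rounded ~30Kpps (10%) quoted earlier.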

Ian

--
Ian Freislich



