From owner-freebsd-net@FreeBSD.ORG  Tue Sep  8 15:50:13 2009
Return-Path: <owner-freebsd-net@FreeBSD.ORG>
Delivered-To: freebsd-net@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id EBD341065676
	for <freebsd-net@freebsd.org>; Tue,  8 Sep 2009 15:50:12 +0000 (UTC)
	(envelope-from gnn@neville-neil.com)
Received: from mail.meer.net (mail.meer.net [64.13.141.3])
	by mx1.freebsd.org (Postfix) with ESMTP id D05798FC16
	for <freebsd-net@freebsd.org>; Tue,  8 Sep 2009 15:50:12 +0000 (UTC)
Received: from mail2.meer.net (mail2.meer.net [64.13.141.16])
	by mail.meer.net (8.13.3/8.13.3/meer) with ESMTP id n88FjNvO001398;
	Tue, 8 Sep 2009 08:45:23 -0700 (PDT)
	(envelope-from gnn@neville-neil.com)
Received: from gnnmac.hudson-trading.com (209.249.190.8.available.above.net
	[209.249.190.8] (may be forged)) (authenticated bits=0)
	by mail2.meer.net (8.14.1/8.14.3) with ESMTP id n88FjMxY030721
	(version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO);
	Tue, 8 Sep 2009 08:45:23 -0700 (PDT)
	(envelope-from gnn@neville-neil.com)
Mime-Version: 1.0 (Apple Message framework v1075.2)
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
From: George Neville-Neil <gnn@neville-neil.com>
In-Reply-To: <20090907191001.GA37291@hal.rescomp.berkeley.edu>
Date: Tue, 8 Sep 2009 11:45:22 -0400
Content-Transfer-Encoding: 7bit
Message-Id: <54FDC10A-EAE3-4AE2-BF36-2C5F7D141C3A@neville-neil.com>
References: <20090904223123.GD16213@hal.rescomp.berkeley.edu>
	<723505E9-96C6-401C-A844-3D9BA2033795@neville-neil.com>
	<20090907191001.GA37291@hal.rescomp.berkeley.edu>
To: Chris Cowart <ccowart@rescomp.berkeley.edu>
X-Mailer: Apple Mail (2.1075.2)
Cc: freebsd-net@freebsd.org
Subject: Re: Crash in ether_input
X-BeenThere: freebsd-net@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Networking and TCP/IP with FreeBSD <freebsd-net.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-net>
List-Post: <mailto:freebsd-net@freebsd.org>
List-Help: <mailto:freebsd-net-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-net>,
	<mailto:freebsd-net-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 08 Sep 2009 15:50:13 -0000


On Sep 7, 2009, at 15:10 , Chris Cowart wrote:

> George Neville-Neil wrote:
>> On Sep 4, 2009, at 18:31 , Chris Cowart wrote:
>>> Starting about a week ago, our primary webserver (then running  
>>> FreeBSD
>>> 7.0) began crashing several times a day, typically during our
>>> higher-load times of day. We have since upgraded to 7.1p7, but
>>> continued
>>> to see the frequent crashes.
>>>
>>> We are running an apache22 webserver with a lot of perl, logging via
>>> syslog-ng, and using IPSec in transport mode between the webserver  
>>> and
>>> both the fileserver and logserver. Everything is IPv4.
>>>
>>> From uname:
>>>
>>> | FreeBSD mug.rescomp.berkeley.edu 7.1-RELEASE-p7 FreeBSD 7.1-
>>> RELEASE-p7
>>> | #0: Wed Sep  2 17:56:59 PDT 2009
>>> | root@mug.rescomp.berkeley.edu:/usr/obj/usr/src/sys/GENERIC  amd64
>>>
>>> Some information that appears typical across many crashes:
>>>
>>> | Unread portion of the kernel message buffer:
>>> |
>>> | Fatal trap 27: stack fault while in kernel mode
>>> | cpuid = 0; apic id = 00
>>> | instruction pointer     = 0x8:0xffffffff80559fb4
>>> | stack pointer           = 0x10:0xffffffffae39faf0
>>> | frame pointer           = 0x10:0xf85ecc37f9239402
>>> | code segment            = base 0x0, limit 0xfffff, type 0x1b
>>> |                         = DPL 0, pres 1, long 1, def32 0, gran 1
>>> | processor eflags        = interrupt enabled, resume, IOPL = 0
>>> | current process         = 27 (em0 taskq)
>>> | trap number             = 27
>>> | panic: stack fault
>>> | cpuid = 0
>>> | Uptime: 43m44s
>>> | Physical memory: 4082 MB
>>> | Dumping 361 MB: 346 330 314 298 282 266 250 234 218 202em0:
>>> watchdog timeout -- resetting
>>>
>>> | (kgdb) bt
>>> | #0  doadump () at pcpu.h:195
>>> | #1  0x0000000000000004 in ?? ()
>>> | #2  0xffffffff804bd9b9 in boot (howto=260) at /usr/src/sys/kern/
>>> kern_shutdown.c:418
>>> | #3  0xffffffff804bddc2 in panic (fmt=0x104 <Address 0x104 out of
>>> bounds>) at /usr/src/sys/kern/kern_shutdown.c:574
>>> | #4  0xffffffff807b9f23 in trap_fatal (frame=0xffffff00012d66e0,
>>> eva=Variable "eva" is not available.
>>> | ) at /usr/src/sys/amd64/amd64/trap.c:764
>>> | #5  0xffffffff807baa75 in trap (frame=0xffffffffae39fa40) at /usr/
>>> src/sys/amd64/amd64/trap.c:565
>>> | #6  0xffffffff807a042e in calltrap () at /usr/src/sys/amd64/amd64/
>>> exception.S:209
>>> | #7  0xffffffff80559fb4 in ether_input (ifp=0xffffff00012bf000,
>>> m=0xffffff0003576000) at /usr/src/sys/net/if_ethersubr.c:545
>>> | #8  0xffffffff802bd645 in em_rxeof (adapter=0xffffffff80e4c000,
>>> count=99) at /usr/src/sys/dev/e1000/if_em.c:4539
>>> | #9  0xffffffff802be55e in em_handle_rxtx (context=Variable
>>> "context" is not available.
>>> | ) at /usr/src/sys/dev/e1000/if_em.c:1702
>>> | #10 0xffffffff804f2afd in taskqueue_run (queue=0xffffff00012c8c80)
>>> at /usr/src/sys/kern/subr_taskqueue.c:282
>>> | #11 0xffffffff804f2da6 in taskqueue_thread_loop (arg=Variable
>>> "arg" is not available.
>>> | ) at /usr/src/sys/kern/subr_taskqueue.c:401
>>> | #12 0xffffffff8049b2f3 in fork_exit (callout=0xffffffff804f2d40
>>> <taskqueue_thread_loop>, arg=0xffffffff80e50588,
>>> frame=0xffffffffae39fc80) at /usr/src/sys/kern/kern_fork.c:804
>>> | #13 0xffffffff807a07fe in fork_trampoline () at /usr/src/sys/ 
>>> amd64/
>>> amd64/exception.S:455
>>> | #14 0x0000000000000000 in ?? ()
>>> | #15 0x0000000000000000 in ?? ()
>>> | #16 0x0000000000000001 in ?? ()
>>> [...]
>>>
>>> | (kgdb) source debug/gdb6
>>> | (kgdb) frame 7
>>> | #7  0xffffffff80559fb4 in ether_input (ifp=0xffffff00012bf000,
>>> m=0xffffff0003576000) at /usr/src/sys/net/if_ethersubr.c:545
>>> | 545             eh = mtod(m, struct ether_header *);
>>> | (kgdb) info locals
>>> | eh = (struct ether_header *) 0xf85ecc37f9239402
>>> | (kgdb) info args
>>> | ifp = (struct ifnet *) 0xffffff00012bf000
>>> | m = (struct mbuf *) 0xffffff0003576000
>>> | (kgdb) mbuf m
>>> | 0xffffff0003576000: 125 bytes ext 0xaf29dcb45d53e701 packet: 125
>>> bytes received via em0
>>> | 0xbb763383e10eda22Cannot access memory at address  
>>> 0xbb763383e10eda3a
>>> | (kgdb)
>>>
>>> If anyone can provide some points on other things I can try to get
>>> useful data out of these core dumps, I'm open to it.
>>>
>>> We did decide to stop mounting NFS, upgrade to syslog-ng3 (which
>>> supports TLS), and revert the webserver back to a GENERIC kernel.
>>> Since
>>> booting the GENERIC kernel, the system has been up for nearly 2  
>>> days.
>>>
>>> Right now, we're logging via TLS to a temporary/testing logserver.
>>> That
>>> logserver is one of our default builds with IPSec. It is  
>>> configured to
>>> forward logs over udp/syslog (via IPSec in transport mode) to our
>>> primary logserver.
>>>
>>> Within hours of beginning to pass the production webserver's logs
>>> through this temporary logserver (and thus having its syslog-ng
>>> forward
>>> to the primary logserver), the temporary logserver began exhibiting
>>> the
>>> same behavior that the webserver was previously showing.
>>>
>>> We're totally grasping at straws here, but it's looking like some  
>>> kind
>>> of bug related to IPSec. Maybe related to long messages? High volume
>>> of
>>> messages?
>>>
>>> We would love to get this hammered out, so please let me know if
>>> there's
>>> any debugging we can perform or patches we can try.
>>>
>>
>> Hi Chris,
>>
>> Sorry to hear y'all are having problems.  You mention that you
>> switched to GENERIC?  Can you send out a copy of your modified kernel
>> config?  I'm  a bit confused because the kernel panic you do show
>> looks like it's from a GENERIC  kernel.
>
> Sorry, the uname was from a stable boot after we had switched to
> GENERIC.
>
> The config from the crashing kernel:
>
> | include		GENERIC
> |
> | ident		RCBSD_REL7
> |
> | # Enabling IPSec (see SysAdmin.IPSec in TWiki)
> | options         IPSEC
> | options         IPSEC_FILTERTUNNEL
> | device		crypto
> |
> | # Enabling quota support
> | options		QUOTA
>
>> Are you setting any of the em device's kernel tunables?
>
> Nope. Some more details on the hardware:
>
> | [ccowart@mug /usr/local/rescomp/etc/online-helpdesk]$ dmesg | grep  
> em0
> | em0: <Intel(R) PRO/1000 Network Connection 6.9.6> port  
> 0xece0-0xecff mem 0xfc3e0000-0xfc3fffff,0xfc3c0000-0xfc3dffff irq 16  
> at device 0.0 on pci14
> | em0: Using MSI interrupt
> | em0: [FILTER]
> | em0: Ethernet address: 00:15:17:7a:b5:e0
> | em0: link state changed to UP
>
>> How is your IPSEC stuff built in to the kernel and are you setting  
>> any
>> kernel variables for it?
>
> See the kernel config above. We are not tuning via sysctls or the
> loader.conf. All of that should be stock for IPSEC. The host is
> configured with 2 peers for IPSec communication. We use racoon and  
> x509
> certificates.
>
>> My best guess from the stack trace is a buffer that is not being
>> returned properly somewhere between the device and the IPsec
>> subsystem.
>
> One of my coworkers has been able to reproduce the crash by syslogging
> really long lines. I'm planning to spend some time today to try to
> create a smaller example case on a couple of vms.

Sounds great.  One more thought.  Can you try this without  
IPSEC_FILTERTUNNEL?
Since that copies packets I'm suspicious of it.

Best,
George