From: Chris Cowart
Date: Thu, 10 Sep 2009 00:37:39 -0700
To: George Neville-Neil
Cc: freebsd-net@freebsd.org
Organization: RSSP-IT, UC Berkeley
Subject: IPSEC + long UDP causes reproducible crash [was: Crash in ether_input]

George Neville-Neil wrote:
> Sounds great. One more thought. Can you try this without
> IPSEC_FILTERTUNNEL?
> Since that copies packets I'm suspicious of it.

Hi,

Thanks for the suggestion. Disabling IPSEC_FILTERTUNNEL does not prevent
the crashes.

I have been using i386 and amd64 virtual machines as well as an amd64
physical machine; this problem can be reproduced fairly reliably on all
of them for 7.0 and 7.1 (and we're pretty sure we saw it in 6.x and
didn't know what it was at the time).

I've pared down the set of steps required to reproduce.

Kernel config:

| include GENERIC
| ident IPSEC
| options IPSEC
| device crypto

/etc/rc.conf:

| ipsec_enable="YES"
| ipsec_file="/etc/ipsec.conf"

em0 is configured for DHCP and running on a 1500 mtu network.

/etc/ipsec.conf:

| add 10.1.10.234 10.1.10.235 esp 12345 -E 3des-cbc
|     0x123456789012345678901234567890123456789012345678;
| add 10.1.10.234 10.1.10.235 ah 22345 -A hmac-md5
|     0x12345678901234567890123456789012;
|
| spdadd 10.1.10.234/32 10.1.10.235/32 any -P out ipsec
|     esp/transport//require ah/transport//require;

This is a minimal IPSEC configuration that causes outbound traffic to
that IP to be passed through IPSEC. You don't even need to configure the
other endpoint to test the crash. Earlier today, I was able to cause a
crash using just esp and using just ah. Either one alone or both
together exhibit the crashes.

A C program that sends long UDP messages is attached (there's a
hardcoded remote IP in there). The program sends 2 UDP messages of size
1960 bytes, sleeping for 3 seconds in between. Most of the time, on a
clean boot, the first message is enough to cause a kernel panic. The
second message almost always causes a kernel panic. I have never been
able to run the program a second time without the system crashing.

The exact point of the panic tends to vary. I've frequently seen it
occur in in_cksumdata, but every panic has been really close to
ip_output.
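
The attached program is not reproduced in this copy of the message. The following is only a rough sketch of what a sender matching the description might look like, not the original attachment. The function names, the 4096-byte buffer limit, the fill byte, and any port number passed in are hypothetical. The destination 10.1.10.235 is assumed from the spdadd rule rather than known to be the hardcoded address. The two small helpers just do the size arithmetic for a UDP payload over IPv4 without options.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define IPV4_HDR_LEN 20u   /* IPv4 header without options */
#define UDP_HDR_LEN   8u   /* fixed UDP header size */
#define MAX_PAYLOAD 4096u  /* hypothetical upper bound for this sketch */

/* Size of the IPv4 datagram carrying the payload, before any IPsec overhead. */
size_t udp_datagram_len(size_t payload_len)
{
    return IPV4_HDR_LEN + UDP_HDR_LEN + payload_len;
}

/* Nonzero when that datagram does not fit in a single packet of size mtu. */
int needs_fragmentation(size_t payload_len, size_t mtu)
{
    return udp_datagram_len(payload_len) > mtu;
}

/*
 * Send `count` UDP messages of `len` bytes to ip:port, sleeping `delay_s`
 * seconds between consecutive messages. Returns 0 on success and -1 on the
 * first failure. The port is an assumption; the report does not name one.
 */
int send_long_udp(const char *ip, uint16_t port, size_t len,
                  int count, unsigned int delay_s)
{
    char buf[MAX_PAYLOAD];
    struct sockaddr_in dst;
    int s, i;

    if (len > sizeof(buf))
        return -1;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
        return -1;

    s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0)
        return -1;

    memset(buf, 'A', len);  /* payload content is arbitrary */
    for (i = 0; i < count; i++) {
        if (sendto(s, buf, len, 0, (struct sockaddr *)&dst,
                   sizeof(dst)) != (ssize_t)len) {
            close(s);
            return -1;
        }
        if (i + 1 < count)
            sleep(delay_s);
    }
    close(s);
    return 0;
}
```

Under those assumptions, a main() that calls send_long_udp("10.1.10.235", 9999, 1960, 2, 3) would match the behavior described above, with 9999 standing in for whatever port the real program used. Note that 1960 + 28 = 1988 bytes already exceeds the 1500-byte MTU before any ESP or AH overhead is added, so every message has to be fragmented somewhere on the output path.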

I've been poking around in the debugger for hours over the past couple
of days. I can't tell if the mbuf is being corrupted as it's passing
through the crypto system or if it's happening in ip_fragment.

I'm in a bit over my head in terms of trying to isolate and patch the
bug. If anyone has the time to squash it or at least give me some
pointers as to where I might look, that would help.

-- 
Chris Cowart
Network Technical Lead
Network & Infrastructure Services, RSSP-IT
UC Berkeley