From: Mike Bristow
To: freebsd-current@freebsd.org
Date: Tue, 24 Aug 2004 22:52:42 +0100
Subject: IPv4 checksum oddness (gcc compiler bug?)
Message-ID: <20040824215242.GB8363@urgle.com>

Hi,

I've been suffering from really horrid (~60-70%) packet loss for a while, but only with IPv4. I spent some time thinking I had a hardware problem, as it started at the same time as I changed some networking bits, but that doesn't appear to be the case: older versions of FreeBSD (5.2.1) don't have this problem.

I've just cvsup'ed this box to RELENG_5 (tag=RELENG_5 date=2004.08.24.00.00.00), and with the attached patch I see many entries like these in my logs:

csum calc discrepancy: 45:10:00:64:b0:22:40:00:40:06:98:95:50:b1:28:36:50:b1:28:34
csum calc discrepancy: 45:10:00:84:b0:23:40:00:40:06:98:74:50:b1:28:36:50:b1:28:34

However, my patch is obviously wrong. The only thing I can think of that might possibly be the cause is a compiler optimization bug, but I'm not sure that that's the case, either. My make.conf is boring (http://www.urgle.com/~mike/make.conf if you want to see its dullness).

I can't believe that this is a real problem, rather than an artifact of my stupidity, because if it were a real problem everyone with old PIIs would be screaming the place down.

Has anyone any ideas as to how to debug this?

The kernel is GENERIC; other possibly interesting facts include:

hw.model: Pentium II/Pentium II Xeon/Celeron
hw.ncpu: 2

vr0: flags=8843 mtu 1500
	inet 80.177.40.52 netmask 0xfffffff0 broadcast 80.177.40.63
	inet6 fe80::280:c8ff:feea:8041%vr0 prefixlen 64 scopeid 0x1
	inet6 2002:50b1:2836:1:280:c8ff:feea:8041 prefixlen 64 autoconf
	ether 00:80:c8:ea:80:41
	media: Ethernet autoselect (100baseTX )
	status: active

--- ip_input.c.orig	Tue Aug 24 22:24:45 2004
+++ ip_input.c	Tue Aug 24 22:24:45 2004
@@ -366,6 +366,14 @@
 	} else {
 		if (hlen == sizeof(struct ip)) {
 			sum = in_cksum_hdr(ip);
+			if (sum) {
+				u_short sumchk;
+				sumchk = in_cksum(m, hlen);
+				if (!sumchk) {
+					printf("csum calc discrepancy: %20D\n", (u_char *)ip, ":");
+					sum = 0;
+				}
+			}
 		} else {
 			sum = in_cksum(m, hlen);
 		}

-- 
You dont have to be illiterate to use the Internet, but it help's.
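
For anyone who wants to check the logged headers outside the kernel, here is a minimal userland sketch of the generic RFC 1071 ones-complement checksum. It is not the kernel's in_cksum() or in_cksum_hdr(); the names inet_cksum, hdr1 and hdr2 are made up for illustration, and the byte arrays are simply the two 20-byte headers from the log lines above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Generic RFC 1071 Internet checksum over len bytes (illustrative only,
 * not the FreeBSD kernel routine). Run over a whole IPv4 header,
 * checksum field included, a result of 0 means the header verifies.
 */
static uint16_t
inet_cksum(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Add up the data as big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((p[i] << 8) | p[i + 1]);

	/* An odd trailing byte is padded with a zero low byte. */
	if (len & 1)
		sum += (uint32_t)(p[len - 1] << 8);

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)(~sum & 0xffff);
}

/* The two headers from the "csum calc discrepancy" log entries. */
static const uint8_t hdr1[20] = {
	0x45, 0x10, 0x00, 0x64, 0xb0, 0x22, 0x40, 0x00, 0x40, 0x06,
	0x98, 0x95, 0x50, 0xb1, 0x28, 0x36, 0x50, 0xb1, 0x28, 0x34
};
static const uint8_t hdr2[20] = {
	0x45, 0x10, 0x00, 0x84, 0xb0, 0x23, 0x40, 0x00, 0x40, 0x06,
	0x98, 0x74, 0x50, 0xb1, 0x28, 0x36, 0x50, 0xb1, 0x28, 0x34
};
```

Calling inet_cksum(hdr1, sizeof(hdr1)) and inet_cksum(hdr2, sizeof(hdr2)) from a small main() gives 0 for both, so both logged headers verify under the generic algorithm. That agrees with in_cksum() returning 0 in the patch, while in_cksum_hdr() returned non-zero for the same bytes.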