Date: Sat, 24 Jan 2004 22:10:11 -0500 (EST)
From: Robert Watson <rwatson@freebsd.org>
To: Matthew Dillon <dillon@apollo.backplane.com>
Cc: hackers@freebsd.org
Subject: Re: XL driver checksum producing corrupted but checksum-correct packets
Message-ID: <Pine.NEB.3.96L.1040124220715.62871T-100000@fledge.watson.org>
In-Reply-To: <200401250302.i0P32BON039881@apollo.backplane.com>
On Sat, 24 Jan 2004, Matthew Dillon wrote:

> Well, I tried to tcpdump a session. I managed to hit the error three
> times but in all three cases the tcpdump on the server dropped the
> particular packet I was looking for. I'm only able to get a 70%
> retention rate in the tcpdump output on the server... its just trying
> to record too much for the machine to handle at the rate the NFS requests
> are coming in.

To pick up the corrupted packet on the machine where the corruption is
occurring, you might want to try hooking up the UDP checksum drop case to
BPF_MTAP() for a special BPF device or rule, or have it spit them into a
raw socket (probably easier). Problem is, the context switching does in
BPF, so if you can get another machine onto the segment without it being
excessively switched (perhaps on a monitor port), using a third machine to
grab the on-the-wire packets might work best. That way you can compare
pre-corruption and post-corruption.

> I'm going to give up trying to characterize the corruption for now.
> It could very well be the PCI latency timer as previously discussed
> but I can't test that right now.

If it is the problem, it may be easier to do this and see if it works than
to track down the packet :-).

good luck...

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research
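
For the pre-corruption versus post-corruption comparison suggested above, here is a minimal user-space sketch in Python, not anything taken from the xl driver or the kernel. It assumes you have already extracted the raw UDP segment (header plus payload) and the IPv4 source and destination addresses from each capture. The function names are made up for illustration. The checksum itself is the standard Internet checksum over the UDP pseudo-header.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit big-endian words, then complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def udp_checksum_ok(src_ip: bytes, dst_ip: bytes, udp_segment: bytes) -> bool:
    """Check a UDP/IPv4 segment (header + payload) against its checksum field.

    src_ip and dst_ip are the 4-byte packed addresses from the IP header.
    """
    if udp_segment[6:8] == b"\x00\x00":
        return True  # zero means the sender did not compute a checksum
    # Pseudo-header: source, destination, zero byte, protocol 17, UDP length.
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 17, len(udp_segment))
    # Summing over data that includes a valid checksum complements to zero.
    return internet_checksum(pseudo + udp_segment) == 0


def diff_offsets(a: bytes, b: bytes) -> list:
    """Byte offsets at which two captures of the same packet differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
```

Running udp_checksum_ok() on both copies shows whether each one is self-consistent, and diff_offsets() on the two copies shows where the bytes diverge, which is the part a checksum-correct but corrupted packet will not reveal on its own.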