Date:      Fri, 23 Jan 2004 22:55:08 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        hackers@freebsd.org
Subject:   XL driver checksum producing corrupted but checksum-correct packets
Message-ID:  <200401240655.i0O6t8lp030917@apollo.backplane.com>

    I tracked down an occasional buildworld failure on DragonFly to my
    XL driver, which is synchronized to 4.x's XL driver.

    What was occurring was that NFS would send an access/lookup RPC and the
    data in the packet would get corrupted by the XL hardware, and the XL
    hardware would apply a valid checksum to the corrupted packet so the NFS
    receiver had no idea that the packet contained corrupted data.  By
    tcpdumping on both the client and the server I observed the client 
    believing it had sent a valid access RPC and the server receiving
    a corrupted access RPC with a valid checksum, then returning an error
    such as EPROTONOSUPPORT back to the client.
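
    To illustrate why the receiver could not notice the damage, here is a
    minimal sketch (not the XL driver or hardware logic) using the RFC 1071
    Internet checksum. The payload bytes and the single flipped bit are
    made up for illustration, and the UDP pseudo-header is left out to keep
    the example short. A checksum computed by the host before the data is
    damaged exposes the corruption; one computed afterwards, as with
    hardware checksum offload, validates the damaged data.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def verifies(data: bytes, checksum: int) -> bool:
    """Receiver-side check: does the data match the checksum it carries?"""
    return internet_checksum(data) == checksum


# Hypothetical payload standing in for an NFS access RPC.
payload = b"NFS ACCESS rpc: fh=0123456789abcdef"

# Simulate corruption on the way through the NIC: flip one bit.
damaged = bytearray(payload)
damaged[5] ^= 0x40
damaged = bytes(damaged)

# Software checksum: computed by the host over the intact data.
software_sum = internet_checksum(payload)
# Offloaded checksum: computed over whatever the NIC ends up sending.
hardware_sum = internet_checksum(damaged)

software_detects = not verifies(damaged, software_sum)
hardware_hides = verifies(damaged, hardware_sum)

print("software checksum exposes corruption:", software_detects)
print("offloaded checksum hides corruption: ", hardware_hides)
```

    In the offloaded case the receiver sees a packet whose checksum matches
    its contents, so the error only surfaces later, when the RPC itself is
    decoded.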

    The corruption seemed to occur about one out of every million packets or
    so.  In DFly the corruption was especially detectable doing buildworlds
    with /usr/src mounted via NFSv3/UDP, with /usr/bin/* residented (a DFly
    feature which replaced the prior prelinking code we had with something
    much better, which FreeBSD-5.x might want to adopt since 5.x is using
    dynamic binaries for /bin now).  The failure showed up about once every
    3 buildworlds, typically as mkdep failing with weird open() errors
    returned by the server after it had tried to process the corrupted NFS
    access/lookup RPC request.  I also observed it with /usr/bin/* not
    residented, but at a much lower frequency: about once every 10
    buildworlds.  I'm not sure why there was a difference.

    Turning off hardware checksums with ifconfig solved the problem, and
    I made it the default for DFly.  I recommend that FreeBSD turn off
    hardware-assisted checksums in the XL driver by default too.

    Here is the pciconf -l output for the XL PCI card I am using:

xl0@pci1:6:0:   class=0x020000 card=0x764610b7 chip=0x764610b7 rev=0x30 hdr=0x00

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>


