Date: Tue, 27 Jan 2009 13:45:56 +0100
From: "Arno J. Klaassen" <arno@heho.snv.jussieu.fr>
To: Dmitry Marakasov <amdmi3@amdmi3.ru>
Cc: current@freebsd.org
Subject: Re: Data corruption with checksum offloading enabled
Message-ID: <wphc3kj4sb.fsf@heho.snv.jussieu.fr>
In-Reply-To: <20090126144044.GB6054@hades.panopticon> (Dmitry Marakasov's message of "Mon\, 26 Jan 2009 14\:40\:44 +0000")
References: <20090123221826.GB30982@deprived.panopticon> <20090126144044.GB6054@hades.panopticon>

Hello,

Dmitry Marakasov <amdmi3@amdmi3.ru> writes:

> For now I have two cases of corruption - in both cases it is single
> difference of one 128 byte block with file offsets 0x65F872 and
> 0x61A072.

I had a similar problem last April on a 7-stable box, which I reported
in the 'nfs-server silent data corruption' thread.

I found:

 - in all failing cases just *one* byte is corrupted, with 4 or all 8
   bits set to zero

   *and*

   the original value is one out of the limited subset {1, 8, 9} ....

Here is the output of `cmp -x $i/BIG $i/BIG2` for some failing cases
I saved:

  03869a48 09 00
  05209d88 09 00
  01777148 09 00
  00f10f88 09 00
  01f4c4c8 11 00
  06c3d6c8 11 00
  0725ca48 18 00
  01608008 09 00
  00f3b888 18 00
  07aa45c8 29 20

Does your corruption match this characterisation as well?

> I was suggested by Andrzej Tobola to try disabling txcsum on a
> network interface. I've disabled both rxcsum and txcsum, and that
> solved a problem.
>
> Judging from that this helped Andrzej with sk(4) and me with ale(4)
> driver, that's not a single driver problem. Does his mean that we
> have global problems with checksum offloading?

I could reproduce it with nfe(4) and re(4) ... interestingly enough,
I could *not* reproduce it when disabling cpu frequency control ...
for what it's worth.

Best,

Arno
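
The pattern described above (a single differing byte whose low nibble or whole value has been zeroed) can be checked mechanically against other corrupted copies. What follows is only a minimal sketch, not a tool used in this thread: the function names `diff_bytes`, `classify` and `compare_files` are hypothetical, and the sketch assumes both files fit in memory. It prints the same zero-based hexadecimal offset and byte pair as `cmp -x`, followed by a label for how the byte changed.

```python
from typing import Iterator, Tuple


def diff_bytes(a: bytes, b: bytes) -> Iterator[Tuple[int, int, int]]:
    """Yield (zero-based offset, good byte, bad byte) for each differing position."""
    for offset, (good, bad) in enumerate(zip(a, b)):
        if good != bad:
            yield offset, good, bad


def classify(good: int, bad: int) -> str:
    """Label how a byte changed (hypothetical categories, chosen for illustration)."""
    if good == bad:
        return "identical"
    if bad & ~good & 0xFF:
        # At least one bit is 1 in the bad copy that was 0 in the good one.
        return "bits set"
    if bad == 0:
        return "whole byte zeroed"
    if bad == (good & 0xF0):
        return "low nibble zeroed"
    return "other bits cleared"


def compare_files(path_good: str, path_bad: str) -> None:
    """Print one line per differing byte: offset, good, bad, classification."""
    # Assumption: both files are small enough to read into memory at once.
    with open(path_good, "rb") as f, open(path_bad, "rb") as g:
        a, b = f.read(), g.read()
    for offset, good, bad in diff_bytes(a, b):
        print(f"{offset:08x} {good:02x} {bad:02x} {classify(good, bad)}")


# Distinct (good, bad) byte pairs taken from the cmp -x output in the message.
SAMPLES = [(0x09, 0x00), (0x11, 0x00), (0x18, 0x00), (0x29, 0x20)]

if __name__ == "__main__":
    for good, bad in SAMPLES:
        print(f"{good:02x} -> {bad:02x}: {classify(good, bad)}")
```

Calling `compare_files("BIG", "BIG2")` on a known-good file and its suspect copy would list every differing byte; any line labelled "bits set" or "other bits cleared" would not fit the pattern reported here.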