Date:      21 Apr 2008 01:02:33 +0200
From:      "Arno J. Klaassen" <arno@heho.snv.jussieu.fr>
To:        stable@freebsd.org
Cc:        net@freebsd.org
Subject:   nfs-server silent data corruption
Message-ID:  <wpmyno2kqe.fsf@heho.snv.jussieu.fr>

Hello,

I have a strange problem with a box I'm setting up as an NFS server
under 7-stable:

 - Tyan S2895 MB, 2 x dual-core Opteron 285, 4G ECC, ahd-scsi, nfe-network
 - stripped-down GENERIC as kernel
 - sources as of last Saturday afternoon (European time)

I removed everything from /boot/loader.conf and /etc/sysctl.conf, but I
still "easily" get data corruption when exporting ahd-scsi over NFS
(NB: exporting geom_raid5 gives the same data corruption).

Testing with the following pseudocode:

  while checksum1 == checksum2 do
    create random file of $1 MBytes
    calculate md5 checksum1
    copy
    calculate md5 checksum2 on copy
  done
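
For anyone who wants to reproduce this, the loop above could be sketched
in Python roughly as follows. This is not the script I actually ran, just
a minimal illustration: the function names, the source file name BIG1 and
the max_rounds bound are assumptions made for the example (the pseudocode
loops until a mismatch), and the destination directory is assumed to be
the NFS mount.

```python
import hashlib
import os
import shutil


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_random_file(path, size_mb):
    """Create a file of `size_mb` MBytes filled with random bytes."""
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(os.urandom(1 << 20))


def copy_until_mismatch(src_dir, dst_dir, size_mb, max_rounds):
    """Repeat create/checksum/copy/checksum.

    Returns the number of the first round whose checksums differ, or
    None if all `max_rounds` rounds matched. The file names BIG1/BIG2
    and the round limit are assumptions of this sketch.
    """
    src = os.path.join(src_dir, "BIG1")
    dst = os.path.join(dst_dir, "BIG2")  # dst_dir: assumed NFS mount
    for round_no in range(1, max_rounds + 1):
        write_random_file(src, size_mb)
        checksum1 = md5_of(src)
        shutil.copyfile(src, dst)
        checksum2 = md5_of(dst)
        if checksum1 != checksum2:
            return round_no
    return None
```

It would be called as copy_until_mismatch(local_dir, nfs_dir, 100, 1000),
where both directory names are placeholders. Note that reading the copy
back on the same client may be served from the client's cache, so a
matching checksum there does not prove the data on the server is intact.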


Tested with two NFS clients, a 6-stable i386 from a couple of weeks
ago and a Linux 2.6.15-gentoo-r1 from about two years ago:
within half an hour the copy differs from the original .... ;(

I played with NFS options on the client side (nfs[23], conn, intr, [udp|tcp],
-r=, -w= ) but none seems to matter.

Starting/stopping rpc.lockd and rpc.statd on server and client just provoked some errors:

 cp: utimes: BIG2: No such file or directory
 cp: chown: BIG2: Stale NFS file handle
 cp: chmod: BIG2: Stale NFS file handle
 cp: chflags: BIG2: Operation not supported
 cp: BIG2: Stale NFS file handle
 cp: setting permissions for `BIG2': Stale NFS file handle
 cp: closing `BIG2': Stale NFS file handle

[and then the while loop continued ... as if the NFS handle were not
 that stale ..]

Anyway, I'll try to nail this down more (e.g. NFS write performance
is horrible ... (nfsd falls down to 0% CPU and then after a while
'wakes up' and is at around 3-6% again)).

I haven't stress-tested this MB for a while, but the last time I did was
with 7-PRERELEASE/RC?/can't-remember-exactly-but-close-to-release
and all worked great.

I did add 2G ECC to the 2nd CPU since, though I doubt that interferes
with NFS.

In short, does anyone have a suggestion? (I will try downgrading
to RELENG_7_0 if no one has a new suggestion for RELENG_7, but I'd rather
go forward and test some possibly suspect recent MFC or any other
suggestion.)

Thanks in advance,

best, Arno
 


   

