Date:      Wed, 11 Jan 2012 20:58:39 -0500
From:      Martin Cracauer <cracauer@cons.org>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        Martin Cracauer <cracauer@cons.org>, freebsd-current@freebsd.org, Stefan Bethke <stb@lassitu.de>
Subject:   Re: Data corruption over NFS in -current
Message-ID:  <20120112015839.GA23012@cons.org>
In-Reply-To: <2072420569.94661.1326332545279.JavaMail.root@erie.cs.uoguelph.ca>
References:  <20120111182110.GA75991@cons.org> <2072420569.94661.1326332545279.JavaMail.root@erie.cs.uoguelph.ca>

Rick Macklem wrote on Wed, Jan 11, 2012 at 08:42:25PM -0500: 
> Martin Cracauer wrote:
> > Stefan Bethke wrote on Wed, Jan 11, 2012 at 07:14:44PM +0100:
> > > Am 11.01.2012 um 17:57 schrieb Martin Cracauer:
> > >
> > > > I'm sorry for the unspecific bug report but I thought a heads-up is
> > > > better than none.
> > > >
> > > > $ uname -a
> > > > FreeBSD wings.cons.org 10.0-CURRENT FreeBSD 10.0-CURRENT #2: Wed Dec 28 12:19:21 EST 2011 cracauer@wings.cons.org:/usr/src/sys/amd64/compile/WINGS amd64
> > >
> > > I'm sure Rick will want to know which NFS version, which client code
> > > (default new code I'm assuming) and which mount options...
> > 
> > It's all default both in fstab and as reported by mount(8).
> > 
> I assume that by the above statement, you mean that you don't specify any
> mount options in your /etc/fstab entry except "rw"? (If this isn't correct,
> please post your /etc/fstab entries for the NFS mounts.)

172.18.30.2:/home/diskless/freebsd-current-usr  /usr    nfs     rw 0 0
172.18.30.2:/home/diskless/usr-ports    /usr/ports      nfs     rw 0 0

> - If I am correct, in that you just specify "rw", the main difference
>   between the old and new NFS client will be the rsize/wsize used. The
>   new NFS client will use MAX_BSIZE (64 KB) decreased to whatever the
>   server says is the largest it can handle. This should be fine, unless
>   the server says it can handle >= 64 KB, but actually only works correctly
>   for 32 KB (which is what the old NFS client will default to, I think?).

I'll try 32 KB.
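
A minimal sketch of what the 32 KB test could look like, assuming the two fstab entries above are kept as they are and only the options field changes. rsize and wsize are standard mount_nfs(8) options; the value 32768 is simply 32 KB and is the assumption being tested, not a known fix. The function name is made up for illustration.

```shell
#!/bin/sh
# Print candidate /etc/fstab entries with the NFS transfer size capped at
# 32 KB, for review before editing /etc/fstab by hand.
# rsize/wsize are mount_nfs(8) options; the 32768 value is the assumption here.
fstab_32k() {
    cat <<'EOF'
172.18.30.2:/home/diskless/freebsd-current-usr  /usr    nfs     rw,rsize=32768,wsize=32768 0 0
172.18.30.2:/home/diskless/usr-ports    /usr/ports      nfs     rw,rsize=32768,wsize=32768 0 0
EOF
}

fstab_32k
```

The filesystems have to be remounted (or the machine rebooted) before the new sizes take effect.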

> A few things to try/check:
> - Look locally on the server to see if the file is corrupted there.

Yes, the server has the corrupted version of the file. In a new run
another file changed to root ownership, and that looks the same from
both the server and the client.

The good news is that this seems fairly reproducible: the root
ownership problem is back.  This time I stopped the script when the
ownership changed, so I don't know whether it would have gone on to
corrupt the file afterwards.
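
A sketch of how the "stop when ownership changes" check could be automated between steps of the workload. The function name and the work directory in the commented example are hypothetical; only find(1) and its -user primary are assumed.

```shell
#!/bin/sh
# List regular files under a tree that are NOT owned by the expected user.
# An empty result means no ownership flip has been observed yet.
check_owner() {
    # $1 = directory to scan, $2 = expected owner (login name)
    find "$1" -type f ! -user "$2" -print
}

# Example (hypothetical work directory): stop at the first ownership change.
# bad=$(check_owner /usr/home/cracauer/work cracauer)
# [ -n "$bad" ] && { echo "ownership changed:"; echo "$bad"; exit 1; }
```

Stopping at the first hit keeps the window between the ownership change and any later corruption small, which should also make the matching part of a packet trace easier to find.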

> - Try the old NFS client. (Set the fs type to "oldnfs" instead of "nfs"
>   on the lines in your /etc/fstab.)
>   - If switching to the old client helps, it might be a bug in the way the
>     new client generates the create verifier. I just looked at the code and
>     I'm not certain the code in the new client would work correctly for an
>     amd64. (I only have i386 to test with.)
>     - I can easily generate a patch that changes the new client to do this
>       the same way as the old client, but there is no point, unless the old
>       client doesn't have the problem.
>     --> Exclusive create problems might explain the incorrect ownership,
>         since it first does a create that will fill in user/group in whatever
>         default way the Linux server chooses to and then does a Setattr RPC
>         to change them to the correct values. If the Setattr RPC fails, then
>         the file exists owned by whatever the server chooses. (I don't know
>         if Linux servers use the gid of the directory or the gid of the
>         requestor or ???)
> - If you have a non-Linux NFS server, try running against that to see if it
>   is a Linux server specific problem. (Since I haven't seen any other reports
>   like this, I suspect it might be an interoperability problem related to the
>   Linux server.)

I should mention that I also updated the server to Linux-3.1.5 two
weeks ago.  I'm not sure I have put heavy load on it since then.

I will have a Linux NFS client do the same thing and try the FreeBSD
things you mention.

> Also, if you can reproduce the problem fairly easily, capture a packet trace via
> # tcpdump -s 0 -w xxx host <server> 
> running on the client (or similar). Then email me "xxx" as an attachment and
> I can look at it in wireshark. (If you choose to look at it in wireshark, I
> would suggest you look for Create RPCs to see if they are Exclusive Creates,
> plus try and see where the data for the corrupt file is written.)
> 
> Even if the capture is pretty large, it should be easy to find the interesting
> part, so long as you know the name of the corrupt file and search for that.

That's probably not practical: we are talking about hammering the NFS
server with several CPU hours' worth of parallel activity from a
shellscript. But I'll do my best :-)
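
One possible way to keep a capture of that size manageable is tcpdump's own file rotation, sketched here under assumptions: -C (file size in millions of bytes) and -W (number of files to cycle through) are tcpdump options, but the sizes, the function name, and the "nfs-trace" prefix are placeholders, and the port filter assumes NFS traffic on the standard port 2049.

```shell
#!/bin/sh
# Ring-buffer capture: keep roughly the last 2 GB of NFS traffic as
# 20 files of about 100 MB each, overwritten in rotation.
capture_nfs() {
    # $1 = NFS server address, $2 = output file prefix
    tcpdump -s 0 -C 100 -W 20 -w "$2" host "$1" and port 2049
}

# Run as root on the client, and interrupt it as soon as the problem shows up:
# capture_nfs 172.18.30.2 nfs-trace
```

The trade-off is that older packets are discarded, so the capture has to be stopped soon after the ownership change is noticed or the relevant Create RPC may already have been overwritten.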

Martin

> > This is a diskless PXE boot but the mount affected (usr) is not the
> > root filesystem, so this should come in via fstab.
> > 
> > BTW, my /usr/ports is another mount so the corruption is cross-mount
> > (garbage from /usr/ports entering /usr).
> > 
> > Appending nfsstat output.
> > 
> nfsstat output is pretty useless for this kind of situation. I did find
> it interesting that you do so many Fsstat RPCs, but that shouldn't be
> a problem, it's just weird to see that.
> 
> rick
> > I am re-running things continuously to see how reproducible this is.
> > This machine was recently updated from a -current almost a year old,
> > so it's its first time with the new NFS client code.
> > 
> > Martin
> > 
> > > > I see filesystem corruption on NFS filesystems here. I am running a
> > > > heavy shellscript that is noodling around with ascii files assembling
> > > > them with awk and whatnot. Some actions are concurrent with up to 21
> > > > forks doing full-CPU load scripting. This machine is a K8 with a
> > > > total of 8 cores, diskless NFS and memory filesystem for /tmp.
> > > >
> > > > I observe two problems:
> > > > - for no reason whatsoever, some files change from my
> > > >  (user/group) cracauer/wheel to root/cracauer
> > > > - the same files will later be corrupted. The beginning of the file
> > > >  is normal but then it has what looks like parts of /usr/ports,
> > > >  including our CVS files and binary junk, mostly zeros
> > > >
> > > > I did do some ports building lately but not at the same time that this
> > > > problem manifested itself. I speculate some ports blocks were still
> > > > resident in the filesystem buffer cache.
> > > >
> > > > Server is Linux.
> > > >
> > > > Martin
> > > > --
> > > > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > > > Martin Cracauer <cracauer@cons.org> http://www.cons.org/cracauer/
> > > > _______________________________________________
> > > > freebsd-current@freebsd.org mailing list
> > > > http://lists.freebsd.org/mailman/listinfo/freebsd-current
> > > > To unsubscribe, send any mail to
> > > > "freebsd-current-unsubscribe@freebsd.org"
> > >
> > > --
> > > Stefan Bethke <stb@lassitu.de> Fon +49 151 14070811
> > >
> > >
> > >
> > 
> > --
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> > Martin Cracauer <cracauer@cons.org> http://www.cons.org/cracauer/
> > 

-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <cracauer@cons.org>   http://www.cons.org/cracauer/


