Date:      Tue, 16 Dec 2014 08:27:56 -0500 (EST)
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        Gerrit Kühn <gerrit.kuehn@aei.mpg.de>
Cc:        freebsd-net@freebsd.org
Subject:   Re: compiling on nfs directories
Message-ID:  <1305365048.13521399.1418736476769.JavaMail.root@uoguelph.ca>
In-Reply-To: <20141216092948.605dc8e2e0fec3fa4a5f8ec1@aei.mpg.de>

Gerrit Kuhn wrote:
> On Mon, 15 Dec 2014 15:59:29 -0500 (EST) Rick Macklem
> <rmacklem@uoguelph.ca> wrote about Re: compiling on nfs directories:
> 
> 
> RM> Also, note that he didn't see the problem with FreeBSD8.3, which would
> RM> have been following the same rules on the server as 10.1.
> RM>
> RM> What I suspect might cause this is one of two things:
> RM> 1 - The modify time of the file is now changing at a time the Linux
> RM>     client doesn't expect, due to changes in ZFS or maybe TOD clock
> RM>     resolution. (At one time, the TOD clock was only at a resolution
> RM>     of 1sec, so the client wouldn't see the modify time change often.
> RM>     I think it is now at a much higher resolution, but would have to
> RM>     look at the code/test to be sure.)
> RM> 2 - I think you mention this one later in your message, in that the
> RM>     build might be depending on file locking. If this is the case,
> RM>     trying NFSv4, which does better file locking, might fix the
> RM>     problem.
> 
> Meanwhile I have googled around a bit more, and one of the few reasons
> other people see the error messages I see appears to be a broken clock that
> makes "make" recompile stuff on the installation stage. As I was already
> wondering why compilation took longer than I had actually expected, I may
> be seeing something similar (still need to look into that), although my
> clock is fine (but time stamps on the NFS might be messed up somehow like
> you mention above under "1").
> 
> RM> Gerrit, I would suggest that you do "nfsstat -m" on the Linux client,
> RM> to see what the mount options are. The Linux client might be using
> RM> NFSv4 already.
> 
> This is what it says about my nfs-root:
> 
> ---
> pt-nds ~ # nfsstat -m
> / from 192.168.32.253:/tank/diskless/nds
>  Flags: rw,relatime,vers=3,rsize=4096,wsize=4096,namlen=255,hard,nolock,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.32.253,mountvers=3,mountproto=tcp,local_lock=all,addr=192.168.32.253
> ---
This looks fine to me, although the rsize, wsize are small.
(This should only affect performance, not cause any build issues.)
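
To check a Flags line like the one quoted above without reading it by eye,
a small Python sketch along these lines could be used. The function name
parse_nfs_flags is made up for illustration, and the option string is simply
copied from the "nfsstat -m" output in this thread.

```python
def parse_nfs_flags(flags):
    """Split a comma-separated NFS mount option string into a dict.

    Options of the form key=value map to their value as a string;
    bare options such as "hard" or "nolock" map to True.
    """
    opts = {}
    for item in flags.strip().split(","):
        if not item:
            continue
        key, sep, value = item.partition("=")
        opts[key] = value if sep else True
    return opts


# Flags line copied from the "nfsstat -m" output quoted in this message.
FLAGS = ("rw,relatime,vers=3,rsize=4096,wsize=4096,namlen=255,hard,nolock,"
         "proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.32.253,"
         "mountvers=3,mountproto=tcp,local_lock=all,addr=192.168.32.253")

opts = parse_nfs_flags(FLAGS)
print("NFS version:", opts["vers"])
print("hard mount:", "hard" in opts)
print("nolock:", "nolock" in opts)
print("rsize/wsize:", opts["rsize"], opts["wsize"])
```

For this mount it reports version 3, a hard mount with nolock, and
4096-byte rsize/wsize, which matches the reading above.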

> 
> This is what I set up for pxe-booting:
> 
> ---
> label gentoo-cs2
>   menu label linux-3.8.13-gentoo-2
>   kernel bzImage-3.8.13-gentoo-2
>   append ip=dhcp root=/dev/nfs rw nfsroot=192.168.32.253:/tank/diskless/nds,nolock,tcp,v3 rootdelay=15
> ---
> 
> 
> So I definitely run "nfsv3" and "nolock". I remember trying to use
> nfsv4 on the diskless machines some years ago, but back then it was
> not ready for prime time.
> 
Yes. You can't use NFSv4 for a root fs (well, maybe Linux can now, but...).
With "nolock" you shouldn't have file locking issues.

> RM> Also, avoid "soft, intr" especially if you are using NFSv4, since these
> RM> can cause slow server response to result in a failure of a read/write
> RM> when it shouldn't fail, due to timeout or interruption by a signal.
> 
> There is "hard" in there as a default option. However, I might try
> turning on locking (I regarded it as superfluous up to now as I have
> only one client using the filesystem).
> 
I think "nolock" (which does the locking locally in the client) will
work better than rpc.lockd.

> RM> If you could find out more about what causes the specific build failure
> RM> on the Linux side, that might help.
> 
> As I said above, I have some hints that indicate something might be
> wrong with timestamps, but I still need to dig deeper into that.
> 
> RM> If you can reproduce a build failure quickly/easily, you can capture
> RM> packets via "tcpdump -s 0 -w <file> host <client-hostname>" on the
> RM> server and then look at it in wireshark to see what the server is
> RM> replying when the build failure occurs. (I don't mind looking at a
> RM> packet trace if it is relatively small, if you email it to me as an
> RM> attachment.)
> 
> I can reproduce it 100%, but it only happens on the installation
> stage, after having compiled the whole stuff. So I don't know if I
> will be able to produce a dump of reasonable size that contains the
> issue, but I'll try.
> 
> RM> ps: I am not familiar with the Linux mount options, but if it has
> RM>     stuff like "nocto", you could try those.
> 
> The manpage has the following:
> 
> ---
>        cto / nocto    Selects whether to use close-to-open cache coherence
>                       semantics. If neither option is specified (or if cto
>                       is specified), the client uses close-to-open cache
>                       coherence semantics. If the nocto option is specified,
>                       the client uses a non-standard heuristic to determine
>                       when files on the server have changed.
>
>                       Using the nocto option may improve performance for
>                       read-only mounts, but should be used only if the data
>                       on the server changes only occasionally. The DATA AND
>                       METADATA COHERENCE section discusses the behavior of
>                       this option in more detail.
> ---
> 
> 
> So "cto" appears to be the default and is probably what is used right
> now. I'll put "nocto" on my list of things to try (although the
> description is not really that encouraging... :-).
> 
Well, "nocto" is meant to improve performance and not correctness, but
it might be worth a try. (For FreeBSD, it would only break the case where
multiple clients modify the same file. For Linux, I don't know.)

You could also try turning off client side attribute caching. I think
the mount options for this will be something like "acregmin=0,acregmax=0".
(This could cause a performance hit, but will force the client to acquire
attributes, including modify time, from the server more frequently.)

I'm not a ZFS guy, but I think there was a recent ZFS patch related
to updating a time attribute, although I can't remember whether it was
atime or mtime.
(You might try a post to freebsd-current@ asking about ZFS time attributes.)
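
One rough way to test the timestamp theory from the client side is to
create a few files on the NFS mount and compare the modify time the server
assigns with the client's own clock. The following is only a sketch under
that assumption; probe_mtime and its parameters are hypothetical names, not
part of any existing tool.

```python
import os
import tempfile
import time


def probe_mtime(directory, samples=5, pause=0.2):
    """Create files in `directory` and compare their mtime to the local clock.

    Returns a list of (skew_seconds, has_subsecond) tuples, where skew is
    the file's mtime minus the local time read just after the write, and
    has_subsecond tells whether the mtime carries a sub-second part.
    """
    results = []
    for _ in range(samples):
        fd, path = tempfile.mkstemp(dir=directory, prefix="mtime-probe-")
        try:
            os.write(fd, b"x")
            os.fsync(fd)  # push the write to the server before closing
        finally:
            os.close(fd)
        local_now = time.time()
        st = os.stat(path)
        os.unlink(path)  # clean up the probe file
        skew = st.st_mtime - local_now
        has_subsecond = (st.st_mtime_ns % 1_000_000_000) != 0
        results.append((skew, has_subsecond))
        time.sleep(pause)
    return results
```

Calling probe_mtime() on a directory of the NFS root and printing the
result shows whether new files appear to be in the client's future
(positive skew) and whether the modify times have sub-second resolution.
A clearly positive skew would be consistent with "make" deciding that
targets are out of date again at the install stage.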

Good luck with it, rick

> 
> cu
>   Gerrit
> _______________________________________________
> freebsd-net@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-net
> To unsubscribe, send any mail to
> "freebsd-net-unsubscribe@freebsd.org"
> 


