Date: Mon, 15 Dec 2014 16:29:01 -0800
From: Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
Cc: freebsd-net@freebsd.org, Gerrit Kühn <gerrit.kuehn@aei.mpg.de>
Subject: Re: compiling on nfs directories
Message-ID: <CAOgwaMtky7a62tn3Q+vsWZObM9NDVE-tR4iqvxqaLSvxTKrWkQ@mail.gmail.com>
In-Reply-To: <306385509.13293746.1418686931068.JavaMail.root@uoguelph.ca>
References: <CAOgwaMvAN1zLhZYSuPxa5n8V9=QAj9Y7hmsmCYSYwqjy0fg-6w@mail.gmail.com> <306385509.13293746.1418686931068.JavaMail.root@uoguelph.ca>
On Mon, Dec 15, 2014 at 3:42 PM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
>
> Mehmet Erol Sanliturk wrote:
> > On Mon, Dec 15, 2014 at 12:59 PM, Rick Macklem <rmacklem@uoguelph.ca>
> > wrote:
> > >
> > > Mehmet Erol Sanliturk wrote:
> > > > On Mon, Dec 15, 2014 at 1:24 AM, Gerrit Kühn
> > > > <gerrit.kuehn@aei.mpg.de> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I ran into some weird issue here last week:
> > > > > I have an NFS server for storage and diskless booting (pxe / nfs
> > > > > root) running under FreeBSD. The clients are running Gentoo Linux.
> > > > > Some time ago, I replaced the server, going from an HDD-based
> > > > > storage array (ZFS) under FreeBSD 8.3 to an SSD-based array under
> > > > > FreeBSD 10-stable (as of February this year - I know this needs
> > > > > updates).
> > > > >
> > > > > Only now I recognized that this somehow appears to have broken
> > > > > some of my Gentoo ebuilds that do not install cleanly anymore.
> > > > > They complain about "soiled libtool library files found" and
> > > > > "insecure RUNPATHs" in the installation stage of shared libs.
> > > > >
> > > > > I was not able to find any useful solution for this on the Net so
> > > > > far. However, I was able to verify that this is somehow an issue
> > > > > with the nfs server by plugging a USB drive into the diskless
> > > > > clients and mounting this as /var/tmp/portage (the directory
> > > > > structure where Gentoo's ebuilds are compiled). This makes the
> > > > > error messages go away, and everything works again (like it did
> > > > > before the server update).
> > > > >
> > > > > Are there any suggestions what might be causing this and how to
> > > > > fix it?
> > > > >
> > > > > cu
> > > > > Gerrit
> > > >
> > > > With respect to the information given in your message, my pure
> > > > guess is the following:
> > > >
> > > > When a client generates a file on the NFS server, it assumes that
> > > > everything is written into the file. The next step (reading the
> > > > generated file) starts, BUT the file is NOT completely written to
> > > > disk yet, therefore it reads an incomplete file, which causes
> > > > errors in the client.
> > >
> > > Well, not exactly. The NFS client chooses whether or not the written
> > > data must be committed to stable storage (disk) right away via a flag
> > > argument on the write. It is up to the client to keep track of what
> > > has been written and, if the FILE_STABLE flag wasn't set, it must do
> > > a separate Commit RPC to force the data to stable storage on the
> > > server.
> > > It is also up to the NFS client to keep track of the file's size
> > > while it is being grown, since the NFS server's size may be smaller
> > > until the data gets written to the server.
> > > Also, note that he didn't see the problem with FreeBSD 8.3, which
> > > would have been following the same rules on the server as 10.1.
> > >
> > > What I suspect might cause this is one of two things:
> > > 1 - The modify time of the file is now changing at a time the Linux
> > >     client doesn't expect, due to changes in ZFS or maybe TOD clock
> > >     resolution. (At one time, the TOD clock was only at a resolution
> > >     of 1 sec, so the client wouldn't see the modify time change
> > >     often. I think it is now at a much higher resolution, but would
> > >     have to look at the code/test to be sure.)
> > > 2 - I think you mention this one later in your message, in that the
> > >     build might be depending on file locking.
> > >     If this is the case, trying NFSv4, which does better file
> > >     locking, might fix the problem.
> > >
> > > Gerrit, I would suggest that you do "nfsstat -m" on the Linux client
> > > to see what the mount options are. The Linux client might be using
> > > NFSv4 already.
> > > Also, avoid "soft, intr", especially if you are using NFSv4, since
> > > these can cause slow server response to result in a failure of a
> > > read/write when it shouldn't fail, due to timeout or interruption by
> > > a signal.
> > >
> > > If you could find out more about what causes the specific build
> > > failure on the Linux side, that might help.
> > > If you can reproduce a build failure quickly/easily, you can capture
> > > packets via "tcpdump -s 0 -w <file> host <client-hostname>" on the
> > > server and then look at it in wireshark to see what the server is
> > > replying when the build failure occurs. (I don't mind looking at a
> > > packet trace if it is relatively small, if you email it to me as an
> > > attachment.)
> > >
> > > Good luck with it, rick
> > > ps: I am not familiar with the Linux mount options, but if it has
> > >     stuff like "nocto", you could try those.
> > >
> > > > In the FreeBSD NFS server, there is NOT (or I could NOT find) a
> > > > facility to store written data immediately to disk.
> > > >
> > > > The NFS server is collecting data up to a point (number of bytes)
> > > > and then writing it to disk; during this phase, whether the NFS
> > > > server is busy or not is not important. With this structure, tasks
> > > > in which a program writes a small number of bytes to be read by
> > > > another program cannot be processed by an NFS server alone.
> > > > I did not try "locking in NFS server": If this route is taken,
> > > > then it is necessary to adjust the clients to wait for such
> > > > periods until the NFS server has removed the lock, after which
> > > > they can continue. (Each such read requires a waiting loop,
> > > > without generating an error message about unavailable data and
> > > > terminating.)
> > > >
> > > > In the Linux NFS server, there is an option to immediately write
> > > > the received data to disk. This improves the above situation
> > > > considerably, but does not completely solve the problem (because
> > > > during reads of data, data in cache is NOT concatenated to the
> > > > data on disk).
> > > >
> > > > Another MAJOR problem is that the NFS server is NOT concatenating
> > > > data in cache to data on disk during reads: This defect makes the
> > > > NFS server useless for, let's say, "real time" applications used
> > > > concurrently or singly by the clients, without using another
> > > > "Server" within the NFS server.
> > > >
> > > > In your case, during software builds, a step is using the
> > > > previously generated files: On a local disk, writing and reading
> > > > are sequential, in the sense that written data is found during
> > > > reading. On an NFS server this is not the case.
> > > >
> > > > With respect to my knowledge obtained from messages on the FreeBSD
> > > > mailing lists, a possibility to read data immediately after it is
> > > > written to the NFS server is NOT available.
> > > >
> > > > Thank you very much.
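The "waiting loop" reader described above (retry on EOF instead of reporting an error, until the data becomes available) could be sketched roughly as follows. This is a hypothetical illustration in Python, not code from the thread; the function name, timeout, and poll interval are invented, and a real deployment would point it at a file on an NFS mount rather than a local one:

```python
import os
import time

# Rough sketch of the "waiting loop" reader described above: on EOF, poll
# and retry instead of failing, until the expected number of bytes shows
# up or a deadline passes.
def patient_read(path, want, timeout=5.0, poll=0.05):
    deadline = time.monotonic() + timeout
    buf = b""
    with open(path, "rb") as f:
        while len(buf) < want:
            chunk = f.read(want - len(buf))
            if chunk:
                buf += chunk                      # got some (more) data
            elif time.monotonic() >= deadline:
                raise TimeoutError("writer did not catch up in time")
            else:
                time.sleep(poll)                  # EOF now, but keep waiting
    return buf

# Local-file demonstration (a real writer would be another process/client):
with open("demo.txt", "wb") as f:
    f.write(b"hello")
data = patient_read("demo.txt", 5)
os.remove("demo.txt")
print(data)  # → b'hello'
```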
> > > > Mehmet Erol Sanliturk
> > > > _______________________________________________
> >
> > When a C program is written to be used in an NFS environment, some
> > possibilities may be used to synchronize writes and reads from the
> > programs, with the unsolved "cached data" problem remaining.
> >
> > When ready-made programs are used, such as "make" or "ld", there is
> > no choice.
> >
> > I am using Pascal programs, so there are no such facilities.
> >
> > The solution may be to improve the NFS Server and Client modules to
> > use cached data during reads:
> >
> > If end of file is reached: Before sending an EOF signal, check
> > whether there is data in the cache or not.
> >     If there is data in the cache: continue reading from the cache up
> >     to its end,
> >     else send an EOF signal.
> >
> > (For random access files, there is also a need to look at the cached
> > values.)
>
> Well, the FreeBSD NFS client (and most others) do extensive data caching
> and will read data from the client cache whenever possible. NFS
> performance without client caching is pretty terrible.
>
> The problem (which has existed since NFS was first developed in about
> 1985) is that NFS does not provide a cache coherency protocol, so when
> multiple clients write data to a file concurrently, there is no
> guarantee that the client read will get the most up-to-date data. There
> has been something called close-to-open (cto) consistency adopted,
> which says that a client will read data written by another client after
> the writing client has closed the file. (Most NFS clients only
> implement this "approximately", since they depend on seeing the modify
> time change to determine this. This may not happen when multiple
> modifications occur in the same time of day clock tick or when clients
> cache the file's attributes and use a stale cached modify time.
> Turning off client attribute caching improves this, but also results in
> a performance hit, due to the extra Getattr RPCs done.)
>
> The current consensus within the NFS community (driven by the Linux
> client implementation) is to only provide data consistency among
> multiple clients when byte-range locking is used on the file.
>
> I'm not sure if this was what you were referring to. (It is true that
> NFS is not and cannot be a POSIX-compliant file system, due to its
> design.)
>
> "make" can often be confused when the modify time isn't updated when
> expected.
>
> If an application running on FreeBSD wants to ensure that data is
> written to stable storage on the server, the application can call
> fsync(2).
>
> > Since the above modification requires knowledge of the internal
> > structure of the NFS Server, and perhaps the NFS Client, I am not
> > able to supply any patch. Also, I am not able to judge its
> > implementation difficulty.
> >
> > My opinion is that the above modification would be a wonderful
> > improvement for the NFS system in FreeBSD, because it would behave
> > just like a local data store, usable for "real time" data processing
> > tasks. In the present structure, this is NOT possible with the NFS
> > Client and Server only.
>
> Many years ago, I implemented a cache coherency protocol for NFS called
> NQNFS. No one used it (at least not much) and it never caught on.
> Most users care about NFS performance, and data coherency has never
> been a priority with most of them, from what I've seen.
>
> rick

With respect to information given in The Design and Implementation of the
FreeBSD Operating System, by Marshall Kirk McKusick, George V.
Neville-Neil, and Robert N.M. Watson, Second Edition, p. 559:

NQNFS has been removed from FreeBSD from Version 5 on. It was available
in Version 4.11:

http://svnweb.freebsd.org/base/release/4.11.0/sys/nfs/nqnfs.h?view=markup

(1) Is there a newer version of NQNFS other than the above which is
available?
A link would be very good, if it is available.

(2) Are there other systems which are using NQNFS in their current
distributions?

> >
> > Thank you very much.
> >
> > Mehmet Erol Sanliturk
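The stale-modify-time failure mode Rick describes in the thread (clients that revalidate their cache only by watching the modify time can miss a write that lands on the same coarse timestamp tick) is easy to reproduce locally. The following Python sketch is purely illustrative and involves no real NFS: it simulates a 1-second clock by pinning the mtime back to the cached value with os.utime:

```python
import os
import tempfile

# Illustration (local files only, no real NFS) of the revalidation trap:
# a client that checks "has the modify time changed?" cannot see a second
# write that lands on the same coarse timestamp tick.
def cache_still_valid(path, cached_mtime):
    """Mimic a client's attribute check: unchanged mtime == fresh cache."""
    return os.stat(path).st_mtime == cached_mtime

fd, path = tempfile.mkstemp()
os.write(fd, b"version 1")
os.close(fd)
cached = os.stat(path).st_mtime        # "client" caches data + its mtime

with open(path, "wb") as f:            # a second writer updates the file...
    f.write(b"version 2")
os.utime(path, (cached, cached))       # ...within the same (simulated) tick

stale_hit = cache_still_valid(path, cached)
os.unlink(path)
print(stale_hit)  # → True: the check says "fresh" although data changed
```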
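The two remedies Rick names above, fsync(2) to push written data to stable storage and byte-range locking as the consistency trigger, might be combined along these lines. This is a hedged sketch using Python's POSIX wrappers (os.fsync, fcntl.lockf); the file name is hypothetical, and it runs here against a local path rather than an NFS mount:

```python
import fcntl
import os

PATH = "shared.dat"  # hypothetical shared file; on NFS this would be on the mount

def locked_write(path, data):
    """Write under an exclusive lock, then fsync to reach stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # exclusive byte-range lock
        os.write(fd, data)
        os.fsync(fd)                     # force the data to stable storage
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)   # unlock; NFS clients flush around this
        os.close(fd)

def locked_read(path, n):
    """Read under a shared lock; NFS clients revalidate caches around locks."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH)   # shared lock for reading
        return os.read(fd, n)
    finally:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        os.close(fd)

locked_write(PATH, b"fresh data")
result = locked_read(PATH, 64)
os.unlink(PATH)
print(result)  # → b'fresh data'
```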
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CAOgwaMtky7a62tn3Q%2BvsWZObM9NDVE-tR4iqvxqaLSvxTKrWkQ>