From: Rick Macklem
To: Mehmet Erol Sanliturk
Cc: freebsd-net@freebsd.org, Gerrit Kühn
Date: Mon, 15 Dec 2014 20:31:17 -0500 (EST)
Subject: Re: compiling on nfs directories
Message-ID: <1877801167.13336057.1418693477745.JavaMail.root@uoguelph.ca>
List-Id: Networking and TCP/IP with FreeBSD

Mehmet
Erol Sanliturk wrote:
> On Mon, Dec 15, 2014 at 3:42 PM, Rick Macklem wrote:
> >
> > Mehmet Erol Sanliturk wrote:
> > > On Mon, Dec 15, 2014 at 12:59 PM, Rick Macklem wrote:
> > > >
> > > > Mehmet Erol Sanliturk wrote:
> > > > > On Mon, Dec 15, 2014 at 1:24 AM, Gerrit Kühn wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I ran into some weird issue here last week:
> > > > > > I have an NFS-Server for storage and diskless booting
> > > > > > (pxe / nfs root) running under FreeBSD. The clients are
> > > > > > running Gentoo Linux. Some time ago, I replaced the server,
> > > > > > going from a HDD-based storage array (ZFS) under FreeBSD 8.3
> > > > > > to an SSD-based array under FreeBSD 10-stable (as of
> > > > > > February this year - I know this needs updates).
> > > > > >
> > > > > > Only now I recognized that this somehow appears to have
> > > > > > broken some of my Gentoo ebuilds that do not install cleanly
> > > > > > anymore. They complain about "soiled libtool library files
> > > > > > found" and "insecure RUNPATHs" in the installation stage of
> > > > > > shared libs.
> > > > > >
> > > > > > I was not able to find any useful solution for this in the
> > > > > > Net so far. However, I was able to verify that this is
> > > > > > somehow an issue with the nfs server by plugging in a
> > > > > > USB-drive into the diskless clients and mounting this as
> > > > > > /var/tmp/portage (the directory structure where Gentoo's
> > > > > > ebuilds are compiled). This makes the error messages go
> > > > > > away, and everything works again (like it did before the
> > > > > > server update).
> > > > > >
> > > > > > Are there any suggestions what might be causing this and how
> > > > > > to fix it?
> > > > > >
> > > > > > cu
> > > > > >   Gerrit
> > > > > >
> > > > >
> > > > > With respect to the information given in your message, my pure
> > > > > guess is the following:
> > > > >
> > > > > When a client generates a file on the NFS server, it assumes
> > > > > that everything has been written into the file.
> > > > > The next step (reading the generated file) starts, BUT the file
> > > > > is NOT completely written to disk yet, therefore it reads an
> > > > > incomplete file, which causes errors in the client.
> > > > >
> > > > Well, not exactly. The NFS client chooses whether or not the
> > > > written data must be committed to stable storage (disk) right
> > > > away via a flag argument on the write. It is up to the client to
> > > > keep track of what has been written and, if the FILE_SYNC flag
> > > > wasn't set, it must do a separate Commit RPC to force the data to
> > > > stable storage on the server.
> > > > It is also up to the NFS client to keep track of the file's size
> > > > while it is being grown, since the NFS server's size may be
> > > > smaller until the data gets written to the server.
> > > > Also, note that he didn't see the problem with FreeBSD 8.3, which
> > > > would have been following the same rules on the server as 10.1.
> > > >
> > > > What I suspect might cause this is one of two things:
> > > > 1 - The modify time of the file is now changing at a time the
> > > >     Linux client doesn't expect, due to changes in ZFS or maybe
> > > >     TOD clock resolution. (At one time, the TOD clock was only
> > > >     at a resolution of 1sec, so the client wouldn't see the
> > > >     modify time change often.
> > > >     I think it is now at a much higher resolution, but would
> > > >     have to look at the code/test to be sure.)
> > > > 2 - I think you mention this one later in your message, in that
> > > >     the build might be depending on file locking. If this is the
> > > >     case, trying NFSv4, which does better file locking, might
> > > >     fix the problem.
> > > >
> > > > Gerrit, I would suggest that you do "nfsstat -m" on the Linux
> > > > client, to see what the mount options are. The Linux client
> > > > might be using NFSv4 already.
> > > > Also, avoid "soft, intr" especially if you are using NFSv4,
> > > > since these can cause slow server response to result in a
> > > > failure of a read/write when it shouldn't fail, due to timeout
> > > > or interruption by a signal.
> > > >
> > > > If you could find out more about what causes the specific build
> > > > failure on the Linux side, that might help.
> > > > If you can reproduce a build failure quickly/easily, you can
> > > > capture packets via "tcpdump -s 0 -w host " on the server and
> > > > then look at it in wireshark to see what the server is replying
> > > > when the build failure occurs. (I don't mind looking at a packet
> > > > trace if it is relatively small, if you email it to me as an
> > > > attachment.)
> > > >
> > > > Good luck with it, rick
> > > > ps: I am not familiar with the Linux mount options, but if it
> > > >     has stuff like "nocto", you could try those.
> > > >
> > > > > In the FreeBSD NFS server, there is NOT (or I was NOT able to
> > > > > find) a facility to store written data to disk immediately.
> > > > >
> > > > > The NFS server collects data up to a point (a number of bytes)
> > > > > and then writes it to disk; whether the NFS server is busy or
> > > > > not during this phase is not important. With this structure,
> > > > > tasks in which one program writes a small number of bytes to
> > > > > be read by another program cannot be handled by an NFS server
> > > > > alone.
> > > > >
> > > > > I did not try "locking in NFS server". If this route is taken,
> > > > > the clients must be adjusted to wait during such periods until
> > > > > the NFS server has removed the lock, so that they can continue.
> > > > > (Each such read requires a waiting loop, without generating an
> > > > > error message about unavailable data and terminating.)
> > > > >
> > > > > In the Linux NFS server, there is an option to write received
> > > > > data to disk immediately. This improves the above situation
> > > > > considerably but does not completely solve the problem (because
> > > > > during reads, data in the cache is NOT concatenated to the data
> > > > > on disk).
> > > > >
> > > > > Another MAJOR problem is that the NFS server does NOT
> > > > > concatenate data in the cache to data on disk during reads.
> > > > > This defect makes the NFS server useless for, let's say, "real
> > > > > time" applications used concurrently or singly by the clients
> > > > > without using another "Server" within the NFS server.
> > > > >
> > > > > In your case, during software builds, a step uses previously
> > > > > generated files. On a local disk, writing and reading are
> > > > > sequential, in the sense that written data is found during
> > > > > reading. On an NFS server this is not the case.
> > > > >
> > > > > As far as I know from messages on the FreeBSD mailing lists, a
> > > > > way to read data immediately after it is written to an NFS
> > > > > server is NOT available.
> > > > >
> > > > > Thank you very much.
> > > > >
> > > > > Mehmet Erol Sanliturk
> > > > >
> > >
> > > When a C program is written for use in an NFS environment, some
> > > facilities may be used to synchronize the programs' writes and
> > > reads around the unsolved "cached data" problem.
> > >
> > > When ready-made programs such as "make" or "ld" are used, there is
> > > no choice.
> > >
> > > I am using Pascal programs, for which there are no such facilities.
> > >
> > > The solution may be to improve the NFS Server and Client modules to
> > > use cached data during reads:
> > >
> > > If end of file is reached: before sending an EOF signal, check
> > > whether there is data in the cache or not.
> > >     If there is data in the cache: continue reading from the
> > >     cache up to the end,
> > >     else send an EOF signal.
> > >
> > > (For random access files, there is also a need to look at the
> > > cached values.)
> > >
> > Well, the FreeBSD NFS client (and most others) does extensive data
> > caching and will read data from the client cache whenever possible.
> > NFS performance without client caching is pretty terrible.
> >
> > The problem (which has existed since NFS was first developed in
> > about 1985) is that NFS does not provide a cache coherency protocol,
> > so when multiple clients write data to a file concurrently, there is
> > no guarantee that the client read will get the most up-to-date data.
> > There has been something called close-to-open (cto) consistency
> > adopted, which says that a client will read data written by another
> > client after the writing client has closed the file. (Most NFS
> > clients only implement this "approximately", since they depend on
> > seeing the modify time change to determine this. This may not happen
> > when multiple modifications occur in the same time of day clock tick
> > or when clients cache the file's attributes and use a stale cached
> > modify time. Turning off client attribute caching improves this, but
> > also results in a performance hit, due to the extra Getattr RPCs
> > done.)
> >
> > The current consensus within the NFS community (driven by the Linux
> > client implementation) is to only provide data consistency among
> > multiple clients when byte range locking is used on the file.
> >
> > I'm not sure if this was what you were referring to. (It is true
> > that NFS is not and cannot be a POSIX compliant file system, due to
> > its design.)
> >
> > "make" can often be confused when the modify time isn't updated when
> > expected.
> >
> > If an application running on FreeBSD wants to ensure that data is
> > written to stable storage on the server, the application can call
> > fsync(2).
> >
> > > Since the above modification requires knowledge of the internal
> > > structure of the NFS Server, and perhaps the NFS Client, I am not
> > > able to supply any patch. I also cannot assess how difficult it
> > > would be to implement.
> > >
> > > My opinion is that the above modification would be a wonderful
> > > improvement for the NFS system in FreeBSD, because it would behave
> > > just like a local data store usable for "real time" data
> > > processing tasks. In the present structure, this is NOT possible
> > > with the NFS Client and Server alone.
> > >
> > Many years ago, I implemented a cache coherency protocol for NFS
> > called NQNFS. No one used it (at least not much) and it never caught
> > on. Most care about NFS performance and data coherency has never
> > been a priority with most users, from what I've seen.
> >
> > rick
> >
>
> According to
>
> The Design and Implementation of the FreeBSD Operating System
> By Marshall Kirk McKusick, George V. Neville-Neil, Robert N.M. Watson
>
> Second Edition: p. 559
>
> NQNFS was removed from FreeBSD from Version 5 on.
>
> It was available in Version 4.11:
>
> http://svnweb.freebsd.org/base/release/4.11.0/sys/nfs/nqnfs.h?view=markup
>
> (1) Is there a newer version of NQNFS than the above available?
>     A link would be very good, if one is available.
>
> (2) Are there other systems which are using NQNFS in their current
>     distributions?
>
Not that I know of. It's long gone dead and buried... rick

>
> Thank you very much.
>
> Mehmet Erol Sanliturk
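
A note for readers following up on two suggestions made in the thread: calling fsync(2) to get written data onto stable storage, and using byte range locking, which the thread describes as the case where data consistency among multiple clients is provided. The Python sketch below combines the two. It is an illustration under assumptions, not code from the thread: the helper names write_durably and read_locked are hypothetical, it uses only the standard os and fcntl modules (Unix only), and whether taking a lock actually makes a given NFS client revalidate its cached data depends on that client and its mount options.

```python
import fcntl
import os


def write_durably(path: str, data: bytes) -> int:
    """Append data under an exclusive byte-range lock, then fsync.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # Exclusive lock from offset 0; a length of 0 means "to end of file".
        fcntl.lockf(fd, fcntl.LOCK_EX)
        try:
            written = 0
            while written < len(data):
                # os.write may write fewer bytes than asked for, so loop.
                written += os.write(fd, data[written:])
            # Ask the OS to push the data to stable storage before the
            # lock is released. On an NFS mount the client is expected to
            # flush and commit its pending writes at this point.
            os.fsync(fd)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
    return written


def read_locked(path: str) -> bytes:
    """Read the whole file while holding a shared byte-range lock."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Shared lock: blocks while a writer holds the exclusive lock.
        fcntl.lockf(fd, fcntl.LOCK_SH)
        try:
            chunks = []
            while True:
                chunk = os.read(fd, 65536)
                if not chunk:
                    break
                chunks.append(chunk)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
    return b"".join(chunks)
```

A producer would call write_durably(path, data) and a consumer read_locked(path) on the same file. This only helps programs that can be changed to cooperate; unmodified tools such as "make" or "ld" take no such locks, so for them the mount options and modify-time behaviour discussed above still matter.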