From owner-freebsd-current Mon Nov 13 12:52:59 1995
Return-Path: owner-current
Received: (from root@localhost)
          by freefall.freebsd.org (8.6.12/8.6.6) id MAA27462
          for current-outgoing; Mon, 13 Nov 1995 12:52:59 -0800
Received: from kitten.mcs.com (Kitten.mcs.com [192.160.127.90])
          by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id MAA27454
          for ; Mon, 13 Nov 1995 12:52:57 -0800
Received: from venus.mcs.com (root@Venus.mcs.com [192.160.127.92])
          by kitten.mcs.com (8.6.10/8.6.9) with SMTP id OAA19025;
          Mon, 13 Nov 1995 14:52:56 -0600
Received: by venus.mcs.com (/\==/\ Smail3.1.28.1 #28.5)
          id ; Mon, 13 Nov 95 14:52 CST
Message-Id:
Subject: Re: ISP state their FreeBSD concerns
To: terry@lambert.org (Terry Lambert)
Date: Mon, 13 Nov 1995 14:52:54 -0600 (CST)
From: "Karl Denninger, MCSNet"
Cc: terry@lambert.org, current@FreeBSD.org
In-Reply-To: <199511132041.NAA17961@phaeton.artisoft.com> from "Terry Lambert" at Nov 13, 95 01:41:13 pm
X-Mailer: ELM [version 2.4 PL24]
Content-Type: text
Content-Length: 2195
Sender: owner-current@FreeBSD.org
Precedence: bulk

> > > Is this the "write with no permission truncate" or what? What is this
> > > NFS write problem?
> >
> > The symptom is that processes block while waiting for writes to complete,
> > sometimes for as long as several *minutes*. For a system which is taking
> > hundreds of hits per minute, this quickly blows the system sky-high.
>
> Ah. So the hangs are on the client.
>
> The NFS protocol definition is that the writes must complete before the
> call returns to the user process.

Yes, yes, I know. That's not the problem.

> There are three ways to solve this problem:

You're assuming the server is too slow. Demonstrably NOT TRUE -- a BSDI
machine on the same network with 10x the I/O load of the FreeBSD machine
does *NOT* exhibit the problem.

> 2) Server caching. The operation is returned completed by the
>    server, which then starts an async event, but no completion
>    routine is associated with the event.
>
>    This is dangerous for similar reasons of data integrity.
>
>    Server caching is disallowed by the NFS design document unless
>    the state can be maintained across a server failure. This
>    means log structuring with log data stored in non-volatile
>    memory (ala Auspex, etc.) so the transaction may be rolled
>    forward.
>
> 3) Increase the number of nfsiod's on the server. This will allow
>    more concurrent operations to be outstanding at one time.
>
>    This is allowed (encouraged) by the NFS design document.
>
> Number 3 is well within your control.

Number 3 is irrelevant (the number of NFSIODs), as is the setting of NFS_ASYNC
in the kernel (which SHOULD make a difference).

Again, there is a *bug* here which causes these deadlocks. It is not a
server performance issue, and is readily reproducible in a large web server
environment such as we have here.

--
--
Karl Denninger (karl@MCS.Net)| MCSNet - The Finest Internet Connectivity
Modem: [+1 312 248-0900]     | (shell, PPP, SLIP, leased) in Chicagoland
Voice: [+1 312 803-MCS1]     | 7 Chicagoland POPs, ISDN, 28.8, much more
Fax: [+1 312 248-9865]       | Email to "info@mcs.net" WWW: http://www.mcs.net
ISDN - Get it here TODAY!    | Home of Chicago's *Three STAR A* Clarinet feed!