Date: Wed, 15 Dec 1999 22:00:20 -0800 (PST)
From: Matthew Dillon
Message-Id: <199912160600.WAA47882@apollo.backplane.com>
To: Andrew Gallatin
Cc: freebsd-current@FreeBSD.ORG
Subject: Re: Serious server-side NFS problem
References: <14423.46117.353932.473968@grasshopper.cs.duke.edu>

Here's a general update on this bug report to -current. It took all day, but I was finally able to reproduce Andrew's bug.

You guys are going to *love* this.

NFS uses the kernel 'boottime' structure to generate its version id. Now, normally you might believe that this structure, once set, will never change. The authors of NFS certainly make that assumption!

No such luck. If you happen to be running, oh, xntpd for example, the kernel adjusts the boottime structure to take time changes into account, including PLL changes. So the boottime structure can in fact change quite often: as often as once each tick.

Now, the effect of boottime changing on NFS is rather drastic. You see, the version id controls whether NFS clients must reset their state machines for NFS data writes. If a client has done a stage 1 write and is ready to do the stage 2 commit, and the version id changes, the client must revert back to stage 1. And so Andrew's bug report comes into the light!
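To make the write/commit interaction concrete, here is a rough toy model in Python. It is only a sketch: the class, function, and parameter names are made up for illustration and are not FreeBSD kernel code, and the "snapshot" mode is an assumption added purely for contrast, not something proposed in this message. The version id here plays the role that the NFSv3 protocol calls the write verifier.

```python
# Toy model of the two-stage NFS write (stage 1 write, stage 2 commit).
# Every name below is hypothetical; this is not the FreeBSD implementation.

class Server:
    def __init__(self, boottime, snapshot_verifier):
        self.boottime = boottime
        self.snapshot_verifier = snapshot_verifier
        # Copy taken once at startup; consulted only in snapshot mode.
        self.saved_verifier = boottime

    def tick(self, adjustment):
        # Models a time daemon nudging the kernel's notion of boot time.
        self.boottime += adjustment

    def verifier(self):
        # Version id handed back to clients with write and commit replies.
        if self.snapshot_verifier:
            return self.saved_verifier
        return self.boottime

    def write(self, data):
        # Stage 1: data accepted, not yet guaranteed to be on stable storage.
        return self.verifier()

    def commit(self):
        # Stage 2: flush to stable storage and report the current version id.
        return self.verifier()


def client_write(server, data, max_attempts=5):
    """Return how many times the data was sent, or None if we gave up."""
    for attempt in range(1, max_attempts + 1):
        write_verf = server.write(data)
        server.tick(adjustment=1)      # clock adjusted between the two requests
        commit_verf = server.commit()
        if commit_verf == write_verf:
            return attempt             # version id unchanged, data is safe
        # Version id changed: treat the data as lost and go back to stage 1.
    return None


stable = client_write(Server(1000, snapshot_verifier=True), b"payload")
drifting = client_write(Server(1000, snapshot_verifier=False), b"payload")
print(stable, drifting)
```

With a version id that stays fixed, the client finishes on the first attempt. With one that follows a drifting boottime, every commit reply disagrees with the preceding write reply, so the client keeps resending until the retry cap (an artifact of this sketch) stops it.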
His poor client (and mine, once I reproduced the bug) got into a state where, because the server was returning a different version id for virtually every packet, it resent the same write data over the network over and over and over and over and over again.

I think recent changes to the way the kernel clocks work in -current brought the bug out into the open, but it's definitely a problem in both -stable and -current. Doh!

I gotta say that if I hadn't happened to be running xntpd on my test box I would have *never* figured it out.

					-Matt