From owner-freebsd-questions Thu Oct 10 8:50: 7 2002
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A5BD137B401
	for ; Thu, 10 Oct 2002 08:50:03 -0700 (PDT)
Received: from infinity.aesredfish.net (ns1.aesredfish.net [65.168.0.12])
	by mx1.FreeBSD.org (Postfix) with ESMTP id AC6B843EA3
	for ; Thu, 10 Oct 2002 08:49:57 -0700 (PDT)
	(envelope-from wmoran@potentialtech.com)
Received: from potentialtech.com (mhope-dhcp-65-168-1-181.dashfast.com [65.168.1.181])
	by infinity.aesredfish.net (8.11.6/8.11.0) with ESMTP id g9AFnmW17460;
	Thu, 10 Oct 2002 11:49:48 -0400
Message-ID: <3DA5A37B.3000706@potentialtech.com>
Date: Thu, 10 Oct 2002 11:57:47 -0400
From: Bill Moran
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.0rc1) Gecko/20020502
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "Hartmann, O."
Cc: freebsd-questions@freebsd.org
Subject: Re: NFS server not respnding!
References: <20021010163107.P68233-100000@klima.physik.uni-mainz.de>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-questions@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

[I'm taking this off -stable because it really doesn't belong there]

Hartmann, O. wrote:
> On Thu, 10 Oct 2002, Bill Moran wrote:
>
> Hello Bill.
>
> :>Hartmann, O. wrote:
> :>> Dear Sirs.
> :>>
> :>> Using FreeBSD 4.6.2-pl2 and FreeBSD 4.7-RC2 on our server system (one 4.7-RC
> :>> experimental system) and utilizing AMD for mounting home space and other
> :>> services via TCP protocol results in
> :>>
> :>> nfs server 134.93.180.216:/usr/homes: not responding
> :>> nfs server 134.93.180.216:/usr/homes: is alive again
> :>>
> :>> very often, when load of the appropriate client is very high.
> :>> That happens when on our number crunching systems utilization of CPU time
> :>> is high or many users try to copy to and from the main NFS server system
> :>> via SAMBA.
> :>
> :>Yup. Happens because either the server or the network is swamped and some
> :>NFS packets are not being responded to as quickly as the client expects.
> :>
> :>Other than being annoying, it's not really a major problem.
>
> Well, it seems to _be_ a major problem due to breaking copy actions from
> Windows clients over SAMBA when the NFS server is not responding.

I wasn't aware that this was causing actual problems (I guess I should read
more carefully).

> I still __use__ TCP for the NFS connections, especially for all AMD mounts.
> Since TCP has been chosen as the main protocol, those problems occur.
> I think something has to be changed to make all clients wait a little bit
> for the server.

Yes, search the archives. There are knobs you can tweak.

> The only real tweak would be to swap over to Gigabit LAN
> within our server system room to avoid the traffic bottleneck between
> the serving systems (we have a really misconfigured network and it makes
> it really hard to deal with a suitable topology - at the moment all traffic
> goes twice through our FreeBSD/PicoBSD filtering bridge).

Err ... I would fix the underlying network problems pronto. Anything you
do to try to work around them is just going to make things worse. Take my
advice on this, I have personal experience.

I worked for a place a year ago that was working with a flaky wireless WAN.
The wireless guy couldn't get the connection reliable (wildly varying ping
times and 5% average packet loss) and I was expected to use FreeBSD to
"make up" for these failures.
To the credit of FreeBSD (and the rest of the open-source workforce), I was
able to do a lot toward hiding the wireless problems, with a combination of
VPN compression and a number of scripts that raised/lowered interfaces when
things failed, as well as other screwball tricks.
BUT, the network config became unbelievably complex and difficult to maintain,
and I was never able to get rid of the problems entirely.
Obviously, the correct solution was to fix the wireless problems, but that
would have cost $$$ (never mind the fact that they paid me tons to constantly
play around with the routers).

>
> Thanks.
> :>
> :>> This happens under heavy load and, when only a few users are on the systems,
> :>> but it happens very often while
> :>>
> :>> - copying big files/data from PC systems via SAMBA
> :>> - whenever a number crunching job runs on a different server
> :>>   and on another server a job for copying data has been started,
> :>>   the influence on a completely different system, in this case the main
> :>>   NFS server, is significant.
> :>>
> :>> FreeBSD offers a lot of kernel stuff for tuning the system's performance,
> :>> especially for NFS etc. (also sysctl changeable kernel variables).
> :>> Can anyone help with tuned parameters or give hints how to
> :>> investigate problems?
> :>
> :>Search the mailing list archives. This was discussed some months back,
> :>and someone provided info on which knobs to tweak to make the messages
> :>go away, along with the possible pitfalls of tweaking those knobs.
> :>
> :>> What about running AMD/NFS over TCP instead of UDP? UDP
> :>> seems to give the benefit of speed, while TCP seems to be more
> :>> reliable and secure from the point of view of the network.
> :>
> :>I don't think switching to TCP will stop this.
> :>To my knowledge, TCP
> :>connections only improve reliability over sketchy connections (such
> :>as WANs). My experience with NFS/TCP has been that it doesn't really
> :>improve reliability that much either (although we were dealing with
> :>an _extremely_ flaky wireless WAN - nothing was reliable).
> :>
> :>--
> :>Bill Moran
> :>Potential Technologies
> :>http://www.potentialtech.com
> :>
> :>
>
> --
> Best regards
> O. Hartmann
>
> ohartman@klima.physik.uni-mainz.de
> ------------------------------------------------------------------
> IT-Administration des Institutes fuer Physik der Atmosphaere (IPA)
> ------------------------------------------------------------------
> Johannes Gutenberg Universitaet Mainz
> Becherweg 21
> 55099 Mainz
>
> Tel: +496131/3924662 (machine room)
> Tel: +496131/3924144 (office)
> FAX: +496131/3923532
>
>

--
Bill Moran
Potential Technologies
http://www.potentialtech.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-questions" in the body of the message
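
On the question of how to investigate the problem: one low-effort starting
point is to tally the two kernel messages quoted in the original post and see
which mounts stall and how often. The following is only a rough sketch, not
something taken from the thread. The function name summarize_nfs_events is
made up for illustration, and it assumes the log lines contain the messages
in exactly the form quoted above.

```python
import re
from collections import Counter

# Matches the two messages quoted in the thread, e.g.
#   nfs server 134.93.180.216:/usr/homes: not responding
#   nfs server 134.93.180.216:/usr/homes: is alive again
# Assumption: the log uses exactly this wording.
NFS_EVENT = re.compile(r"nfs server (\S+): (not responding|is alive again)")


def summarize_nfs_events(lines):
    """Return (stall count per mount, set of mounts still marked down).

    `lines` is any iterable of log lines; unrelated lines are ignored.
    """
    stalls = Counter()
    down = set()
    for line in lines:
        match = NFS_EVENT.search(line)
        if match is None:
            continue
        mount, event = match.groups()
        if event == "not responding":
            stalls[mount] += 1
            down.add(mount)
        else:
            # "is alive again" clears the outstanding stall, if any.
            down.discard(mount)
    return stalls, down
```

Feeding it an open log file (for example /var/log/messages, assuming that is
where the kernel messages end up on your systems) gives a per-mount count
that can be compared against the times the SAMBA copies or number crunching
jobs were running.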