Message-ID: <3B209567.1AE09631@mindspring.com>
Date: Fri, 08 Jun 2001 02:05:43 -0700
From: Terry Lambert
Reply-To: tlambert2@mindspring.com
To: Ian Dowse
Cc: Graham Barr, Alfred Perlstein, freebsd-hackers@FreeBSD.ORG
Subject: Re: read(2) and ETIMEDOUT
References: <200106072116.aa63698@salmon.maths.tcd.ie>

Ian Dowse wrote:
>
> In message <20010607201846.E50444@pobox.com>, Graham Barr writes:
>
> >Also why does this happen only every few hours ? There is a lot of
> >data going through these connections maybe the timer for SO_RCVTIMEO
> >is not being reset.
> >
> >But then we have another server, with a similar number of clients and
> >data through put, but it does not suffer from this problem.
>
> I suspect that the server seeing this problem has a client that
> occasionally disappears from the network, or for whatever reason
> fails to respond to any packets for a long time (something like 5
> or 10 minutes). I've seen blocking TCP writes return ETIMEDOUT when
> the network between the client and the server goes down.
> In the non-blocking case I think the following can happen:

I believe the proxy ARP normally sent when an interface comes up can
have this effect when a client goes down and someone else gets its
DHCP lease.

You don't often see this on FreeBSD clients after 4.1, since the
gratuitous proxy ARP became broken around then (if you change your
IP address, it won't send the ARP unless you down the interface
first and bring it back up, and it caches bad clone routes, too,
just to make your life miserable).

Probably your lease expiration times are set too low. This is
usually the case in networks where people have transient connections
for things like mobile users, have exhausted their IP address space,
and are trying to conserve it by using much shorter leases.

A good, real fix for this is to have incredibly long lease lifetimes
(basically, the DHCP server hands out the lease, and if the computer
comes back days later, it gets the same lease). For this to work,
you are probably going to have to make the local DHCP server give
out 10.x addresses, and then NAT the 10.x net for real Internet
connectivity.

Alternately, it could be something completely different. 8-).

-- Terry
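
On the read(2) side of the thread: a receive timeout set with
SO_RCVTIMEO and a peer that has vanished show up as different errno
values, so it can help to tell them apart before deciding what to do
with the connection. What follows is only a minimal Python sketch, not
code from anyone in this thread. The helper names classify_read_error
and read_some, and the category strings they return, are made up for
illustration. It assumes the usual BSD sockets behaviour, where an
expired SO_RCVTIMEO or an empty non-blocking socket reports
EAGAIN/EWOULDBLOCK, while ETIMEDOUT means TCP gave up on the peer.

```python
import errno


def classify_read_error(exc):
    """Map the errno of a failed socket read to a coarse cause.

    The category strings are hypothetical labels for this sketch.
    """
    if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
        # No data arrived in time: a non-blocking socket had nothing to
        # read, or an SO_RCVTIMEO receive timeout expired. The
        # connection itself is still considered up.
        return "no-data"
    if exc.errno == errno.ETIMEDOUT:
        # TCP stopped getting answers from the peer (retransmissions or
        # keepalive probes went unanswered), so the connection is dead.
        return "peer-unreachable"
    if exc.errno in (errno.ECONNRESET, errno.EPIPE):
        # The peer, or something claiming its address, reset the
        # connection.
        return "peer-reset"
    return "other"


def read_some(sock, nbytes=4096):
    """Try one read; return (data, None) or (None, category)."""
    try:
        return sock.recv(nbytes), None
    except OSError as exc:
        return None, classify_read_error(exc)
```

A server loop could keep a client whose read came back as "no-data"
and close the descriptor for anything else, logging the category so
that occasional "peer-unreachable" results can be matched against
DHCP lease churn or network outages.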