From: Bruce Evans
Date: Thu, 8 Feb 2007 05:50:35 +1100 (EST)
To: Oleg Bulyzhin
Cc: Robin Gruyters, freebsd-net@freebsd.org
Subject: Re: [Fwd: Re: bge Ierr rate increase from 5.3R -> 6.1R]
Message-ID: <20070208043101.K687@besplex.bde.org>
In-Reply-To: <20070207193426.P35180@besplex.bde.org>

On Wed, 7 Feb 2007, Bruce Evans wrote:

> On Wed, 7 Feb 2007, Oleg Bulyzhin wrote:
>
>> On Wed, Feb 07, 2007 at 01:31:39AM +1100, Bruce Evans wrote:
>>> I use jdp's quicker fix.
>>> It works fine for detecting cable unplug
>>> and replug, but link detection is still very bad at boot time and

I've now tested your fix in 6.2.  It works to fix the few dropped packets
every second, like jdp's fix does in -current.  I didn't notice any other
effect.

>>> after down/up (seems to be worse for down/up than unplug/replug?).
>>> Link detection in -current generally seems to be much worse than
>>> in 5.2.  On some systems I use two ping -c2's early in the boot to
>>> wait for the link to actually be up.  The first ping tends to fail
>>> and the second tends to work, both long after the link claims to
>>> be up.  Then other network activity still takes too long to start.
>>> Without the pings, an "ntpdate -b" early in the boot fails about
>>> half the time and gives messed up timing activity when it fails,
>>> and initial nfs mounts take 30-60 seconds.  Later after down/up and
>>> waiting for the "up" message, ttcp -u usually fails to connect the
>>> first time and then works normally with no failure or connection
>>> delay the second time.
>>
>> Could you please give some more details? I'm interested in:

Further testing seems to show that the problem has little or nothing to
do with link detection or the FreeBSD version, but is related to route
expiry and possibly the driver's interaction with this.
All of the following give similar bad behaviour on a simple network with
static routes:

(1) ifconfig $interface down up    # bge, sk, xl, fxp all have the problem
                                   # but it seems to be much smaller for
                                   # fxp
    sleep 1                        # longer doesn't help unless the other
                                   # side establishes the route
    netstat -r                     # shows empty expire time
    route get $remotehost          # shows negative expire time
    ping -c1 $remotehost           # first ping always fails (unless other
                                   # side establishes route beforehand)
    ping -c1 $anotherremotehost    # similarly for all other remote hosts
    netstat -r                     # still shows empty expire time
    ping -c1 $remotehost           # second ping usually works
    netstat -r                     # shows expire time of ~1200 seconds
    ping -c1 $remotehost           # third ping works if second one didn't

(2) ifconfig $interface down
    sh /etc/netstart
    [rest as above]

(3) route delete $remotehost
    [rest as above, for one host only]

(1a-3a) as above, but use "ttcp -nlarge ..." instead of "ping -c1".
    The first ttcp fails with EHOSTDOWN.

Well, that was under FreeBSD-mumble.  On retesting (3) under 6.2 with your
fix, the behaviour is much better -- the first ping usually works
instantly.  (3a) still fails on the first ttcp.  6.2 seems to be missing
your fix for the time_uptime time warp -- after "route delete $remotehost",
the negative expire time is slightly less than the uptime under 6.2, but
it apparently should be slightly less than 0 as in ~5.2 and -current.  I
don't understand the expire times -- are they really supposed to go
negative?

Back under ~5.2, (3) is still misbehaving.

Next I tried "ping -fq -c[1-10]" to try to avoid the delay getting the
route established.  This didn't work.  There was still a delay; for small
counts all except the last packet were lost; for large counts, most
packets caused EHOSTDOWN.  I now remember that debugging of "ttcp -nlarge"
showed similar behaviour -- sendto() succeeded for the first few packets
after down/up and then EHOSTDOWN was returned and ttcp aborted.
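The ping workaround in the recipes above (the first probe fails, a later
one gets through) can be sketched as a small sh function.  This is only
an illustration under stated assumptions: the names wait_for_route, PROBE
and RETRY_DELAY are hypothetical and do not come from any FreeBSD script,
and the default of 3 tries just mirrors the three pings in recipe (1).

```shell
#!/bin/sh
# Hypothetical helper: retry a one-packet probe until the route to a
# host is usable.  PROBE and RETRY_DELAY are assumptions made for this
# sketch; PROBE defaults to "ping -c1" as used in the recipes.
wait_for_route() {
	host=$1
	tries=${2:-3}		# recipe (1) needed at most three pings
	i=1
	while [ "$i" -le "$tries" ]; do
		# Stop as soon as one probe gets through.
		if ${PROBE:-ping -c1} "$host" >/dev/null 2>&1; then
			echo "route to $host usable after $i probe(s)"
			return 0
		fi
		i=$((i + 1))
		sleep "${RETRY_DELAY:-1}"
	done
	echo "route to $host still not usable after $tries probe(s)" >&2
	return 1
}
```

A boot script could then gate the first real network user on it, for
example `wait_for_route "$ntpserver" && ntpdate -b "$ntpserver"`, where
$ntpserver is a placeholder.  This only hides the delay; it does nothing
about the underlying route expiry behaviour.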
The behaviour after booting seems to be the same, but I haven't checked
it using netstat etc. yet.  Normally there is a more important utility
than ping that is the first one to access the network.  Network
initialization doesn't seem to do anything to ensure that the routes are
actually usable immediately, at least in the old version of userland that
I use, so it is left to the first utility that uses a route to retry after
the first few packets are dropped.  My configuration runs either ntpdate
or nfs first, and these don't seem to handle the problem well:

- ntpdate now usually succeeds after 10-20 seconds, but sometimes it fails
  after 10-20 seconds.  I think it may be retrying once but not twice, so
  it fails in cases where 3 ping -c1's would have been needed to establish
  the route.  Or maybe 2 retries with success on the last, where 4 pings
  would have been needed.  Under the 6.2 kernel, messages about 2 sendto
  failures were printed before the success in one successful case.  I
  haven't noticed these messages under -current or ~5.2.
- nfs now usually takes 55 seconds to succeed on one client.  It used to
  take < 1 or 2 seconds when the client and server were both fxp (now the
  client is sk and the server is bge).  On another client with bge and
  with ping before nfs, nfs takes only 10-20 seconds to succeed despite
  the pings being to the ntp server and not the nfs server.

>> - is there recipe how to trigger erroneous behaviour? i'll try it on mine
>>   bge cards. (i have bcm5721 5700 & 5701, but i didn't notice errors in
>>   link handling with -current driver (and i had problems with 5.x driver))
>
> Just boot or down/up.

route delete followed by ttcp is easiest.

>> The fact 'ping' workaround does help looks like lost interrupt.
>
> Ah, it could be my returning immediately in bge_intr() if the status
> block hasn't been updated.  This wouldn't notice changes to the software
> link status.

It wasn't that.  Debugging showed that the early return never happened.

Bruce