From owner-cvs-all@FreeBSD.ORG Sun Dec 24 10:11:25 2006
X-Original-To: cvs-all@FreeBSD.org
Delivered-To: cvs-all@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 560A316A412;
	Sun, 24 Dec 2006 10:11:25 +0000 (UTC)
	(envelope-from bde@zeta.org.au)
Received: from mailout2.pacific.net.au (mailout2-3.pacific.net.au [61.8.2.226])
	by mx1.freebsd.org (Postfix) with ESMTP id E43F713C47E;
	Sun, 24 Dec 2006 10:11:24 +0000 (UTC)
	(envelope-from bde@zeta.org.au)
Received: from mailproxy1.pacific.net.au (mailproxy1.pacific.net.au [61.8.2.162])
	by mailout2.pacific.net.au (Postfix) with ESMTP id 303126E0F6;
	Sun, 24 Dec 2006 21:11:22 +1100 (EST)
Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246])
	by mailproxy1.pacific.net.au (Postfix) with ESMTP id 2243B8C06;
	Sun, 24 Dec 2006 21:11:22 +1100 (EST)
Date: Sun, 24 Dec 2006 21:11:21 +1100 (EST)
From: Bruce Evans
X-X-Sender: bde@delplex.bde.org
To: Scott Long
In-Reply-To: <458E1579.1050907@samsco.org>
Message-ID: <20061224204609.I25303@delplex.bde.org>
References: <200612201203.kBKC3MhO053666@repoman.freebsd.org>
	<20061220132631.GH34400@FreeBSD.org>
	<20061222003115.R16146@delplex.bde.org>
	<20061223215918.GA33627@lath.rinet.ru>
	<20061224124016.F24444@delplex.bde.org>
	<458E1579.1050907@samsco.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org,
	Gleb Smirnoff, Bruce Evans, Oleg Bulyzhin
Subject: Re: cvs commit: src/sys/dev/bge if_bge.c
X-BeenThere: cvs-all@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: CVS commit messages for the entire tree
X-List-Received-Date: Sun, 24 Dec 2006 10:11:25 -0000

On Sun, 24 Dec 2006, Scott Long wrote:

> Bruce Evans wrote:
>> On Sun, 24 Dec 2006, Oleg Bulyzhin wrote:
>>> it's quite unusual) and it is not lock related:
>>> 1) bge_start_locked() & bge_encap fills tx ring.
>>> 2) during next 5 seconds we do not have packets for transmit (i.e. no
>>>    bge_start_locked() calls --> no bge_timer refreshing)
>>> 3) for any reason (don't ask me how can this happen), chip was unable to
>>>    send whole tx ring (only part of it).
>>> 4) here we have false watchdog - chip is not wedged but bge_watchdog would
>>>    reset it.
>>
>> Then it is a true watchdog IMO. Something is very wrong if you can't send
>> 512 packets in 5 seconds (or even 1 packet in 5/512 seconds).
>
> No it's not wrong. You can be under heavy load and be constantly preempted.
> Or you could be getting fed a steady stream of traffic
> and have a driver that is smart enough to clean the TX-complete ring
> in if_start if it runs out of TX slots. These effects have been
> observed in at least the if_em driver.

Come on, we want to handle 100's of kpps. Something is very wrong if we
cannot handle 100 pps on one interface in one direction. Other interfaces
and directions shouldn't be allowed to dominate so much that anything gets
starved.

I would agree that a 5 second timeout is too short for 1 Mbps ethernet,
iff 1 Mbps NICs had rx rings with 512 entries :-), since 1 Mbps can only
handle 82 pps with 1518-byte packets. The timeout was 2 seconds for
10 Mbps ethernet in most drivers in FreeBSD-1 in 1994.

Drivers could easily have a bug like cleaning the tx ring without
adjusting the watchdog timer. Did you see the effects for em under UP?
Under SMP, the race decrementing the timer made it hard to tell what
caused watchdog timeouts.

Bruce
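
For reference, the rates quoted above can be checked with a short calculation. This is only a sketch: the function and constant names are made up for illustration, and the figures count the 1518-byte frame alone, ignoring preamble and inter-frame gap (which is what gives the 82 pps quoted for 1 Mbps).

```python
# Illustrative names only; nothing here comes from if_bge.c.
FRAME_BYTES = 1518        # maximum untagged Ethernet frame, including the FCS
RING_ENTRIES = 512        # ring size used in the discussion above
WATCHDOG_SECONDS = 5      # watchdog timeout used in the discussion above


def packets_per_second(link_bps, frame_bytes=FRAME_BYTES):
    """Frames per second at line rate, ignoring preamble and inter-frame gap."""
    return link_bps / (frame_bytes * 8)


def ring_drain_seconds(link_bps, entries=RING_ENTRIES, frame_bytes=FRAME_BYTES):
    """Time to transmit a full ring of maximum-size frames at line rate."""
    return entries / packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    # Rate needed to empty the ring before the watchdog fires: about 102 pps.
    print("required pps:", RING_ENTRIES / WATCHDOG_SECONDS)
    for mbps in (1, 10, 100, 1000):
        bps = mbps * 1_000_000
        print(mbps, "Mbps:", int(packets_per_second(bps)), "pps,",
              ring_drain_seconds(bps), "s to drain a full ring")
```

At 1 Mbps a full 512-entry ring of 1518-byte frames takes a little over 6 seconds to drain, so a 5-second timeout would indeed be too short there. At 10 Mbps the same ring drains in under a second, comfortably inside even the old 2-second timeout.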