Date: Sun, 24 Dec 2006 12:56:39 +1100 (EST)
From: Bruce Evans <bde@zeta.org.au>
To: Oleg Bulyzhin
Cc: cvs-src@freebsd.org, Gleb Smirnoff, src-committers@freebsd.org,
    cvs-all@freebsd.org
Subject: Re: cvs commit: src/sys/dev/bge if_bge.c
Message-ID: <20061224124016.F24444@delplex.bde.org>
In-Reply-To: <20061223215918.GA33627@lath.rinet.ru>

On Sun, 24 Dec 2006, Oleg Bulyzhin wrote:

> On Fri, Dec 22, 2006 at 01:24:45AM +1100, Bruce Evans wrote:
>> On Wed, 20 Dec 2006, Gleb Smirnoff wrote:
>>>
>>> I have a suspicion that this may cause a problem under high load.
>>> Imagine that thread #1 is spinning in bge_start_locked(), getting
>>> packets out of the interface queue and putting them into the TX ring.
>>> Other threads are putting packets into the interface queue while its
>>> lock is temporarily relinquished by thread #1. At the same time,
>>> interrupts happen and some packets are sent, but the TX ring never
>>> gets empty.
>>>
>>> The above scenario will cause a fake watchdog event.
>>
>> bge_start_locked() starts with the bge (sc) lock held and never
>> releases it as far as I can see. Thus this problem can't happen (the
>> lock prevents both txeof and the watchdog from being reached before
>> start resets the timeout to 5 seconds).
>
> But there is another scenario (it's quite unusual), and it is not lock
> related:
> 1) bge_start_locked() & bge_encap() fill the tx ring.
> 2) during the next 5 seconds we have no packets to transmit (i.e. no
>    bge_start_locked() calls --> no bge_timer refreshing)
> 3) for some reason (don't ask me how this can happen), the chip was
>    unable to send the whole tx ring (only part of it).
> 4) here we have a false watchdog - the chip is not wedged, but
>    bge_watchdog would reset it.

Then it is a true watchdog IMO. Something is very wrong if you can't
send 512 packets in 5 seconds (or even 1 packet in 5/512 seconds).
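
To make the committed behaviour concrete, it amounts to roughly the
following (a from-memory sketch, not the actual diff; field names like
bge_timer and bge_txcnt may not match if_bge.c exactly):

    static void
    bge_start_locked(struct ifnet *ifp)
    {
        struct bge_softc *sc = ifp->if_softc;

        BGE_LOCK_ASSERT(sc);
        /* ... bge_encap() packets from the if queue onto the tx ring ... */
        sc->bge_timer = 5;      /* arm a fixed 5 second deadline */
    }

    static void
    bge_txeof(struct bge_softc *sc)
    {
        /* ... reclaim completed tx descriptors ... */
        if (sc->bge_txcnt == 0)
            sc->bge_timer = 0;  /* ring empty: disarm */
    }

    static void
    bge_watchdog(struct bge_softc *sc)
    {
        /* Runs once a second; 0 means "not armed". */
        if (sc->bge_timer == 0 || --sc->bge_timer > 0)
            return;
        if_printf(sc->bge_ifp, "watchdog timeout -- resetting\n");
        bge_init_locked(sc);
    }

The point is that only start arms the timer, so the deadline for
draining whatever was queued is fixed at 5 seconds.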
> I'd say the correct solution would be a combination of yours and the
> previous bge_txeof() behaviour:
> 1) We should cancel the watchdog if the tx ring is empty (as you did).
> 2) We should not clear bge_timer inside the loop (like it was before),
>    but set it to 5.
> So the watchdog is active until the tx ring is empty, and we will not
> get a watchdog event while the tx ring consumer index is moving.

I prefer the overall timeout of 5 seconds (maybe even 1 second, but
with a 1-second tick the simple algorithm can't reliably time out in
less than an average of 1.5 seconds). I don't want the watchdog to be
delayed by 512*5 seconds (nearly 43 minutes) in the unlikely event
that 1 packet trickles out every 5 seconds, especially since
implementing this takes more code (though the amount is tiny). Of
course, that won't happen in practice; going 5 seconds without an
interface-start call will also be rare.

Bruce
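
P.S. For concreteness, my reading of your proposal (an untested sketch,
with the same approximate names as above; hw_cons stands in for the tx
consumer index that the real code reads from the status block) is just
a different tail in bge_txeof():

    static void
    bge_txeof(struct bge_softc *sc, uint16_t hw_cons)
    {

        while (sc->bge_tx_saved_considx != hw_cons) {
            /* ... unload the DMA map, free the mbuf ... */
            BGE_INC(sc->bge_tx_saved_considx, BGE_TX_RING_CNT);
            sc->bge_txcnt--;
            sc->bge_timer = 5;  /* progress: re-arm instead of clearing */
        }
        if (sc->bge_txcnt == 0)
            sc->bge_timer = 0;  /* ring fully drained: disarm */
    }

Each re-arm buys another 5 seconds, so with a 512-entry ring the reset
can be pushed out by up to 512*5 seconds - that is the worst case I
want to avoid.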