Date: Fri, 6 Apr 2007 12:31:28 +1000 (EST)
From: Bruce Evans
To: Vladimir Ivanov
Cc: freebsd-net@freebsd.org
Subject: Re: Serious bug in most (?) ethernet drivers (bge, bce, ixgb etc.).

On Thu, 5 Apr 2007, Vladimir Ivanov wrote:

> We have reported serious bug with em driver
> (http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/87418) one year and half
> ago.
> It's very funny but most freebsd ethernet drivers cloned this bug I seem.
> You can see same bug in bce, bge, ixgb and so on.

I can only see it in bce and ixgb.
bge is much simpler and better -- bge_rxeof() doesn't depend on any state
after the unlock/re-lock except the rx indexes, and these are both reset to
0 by reinitialization. However, reinitialization often panics bge_rxeof()
anyway. The only reason for the panics that I can think of is that nothing
is declared volatile but the producer index is extremely volatile, so the
following races are possible:
- compiler caching the indexes. bce implements this as foot-shooting. I
  think aliasing problems prevent the compiler doing it, so declaring
  things as volatile would make no difference.
- a race with the hardware in initialization might result in the producer
  index being nonzero and referring to old data despite it having been
  reset to 0 and no new data arriving. Stopping the hardware for
  initialization should prevent such races.

Bruce
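
The index-caching race described above can be illustrated with a small,
single-threaded C model. This is only a sketch under stated assumptions, not
code from if_bge.c or if_bce.c: the toy_softc structure, its field names and
every toy_* function are hypothetical, and the "reinitialization from another
thread" is simulated by a counter that fires while the lock is dropped.
toy_rxeof() re-reads both indexes after every re-lock, while
toy_rxeof_cached() snapshots the producer index once, which is the kind of
caching the message warns about.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 8u

/* Hypothetical softc; names are illustrative, not from any real driver. */
struct toy_softc {
    volatile unsigned rx_prod;   /* written by the "hardware" */
    unsigned rx_cons;            /* written by the driver */
    int ring[RING_SIZE];
    bool locked;
    unsigned reinit_after;       /* reinit during the Nth unlock; 0 = never */
    unsigned unlocks;
};

/* Reinitialization resets both rx indexes to 0. */
static void toy_init_locked(struct toy_softc *sc)
{
    assert(sc->locked);
    sc->rx_prod = 0;
    sc->rx_cons = 0;
}

/* Stand-in for handing a packet up the stack with the lock dropped. */
static void toy_input_unlocked(struct toy_softc *sc, int pkt)
{
    (void)pkt;
    sc->locked = false;
    sc->unlocks++;
    if (sc->reinit_after != 0 && sc->unlocks == sc->reinit_after) {
        /* Simulated other thread: take the lock, reinitialize, release. */
        sc->locked = true;
        toy_init_locked(sc);
        sc->locked = false;
    }
    sc->locked = true;
}

/* Depends only on the live indexes after each re-lock. */
static unsigned toy_rxeof(struct toy_softc *sc)
{
    unsigned handed = 0;

    assert(sc->locked);
    while (sc->rx_cons != sc->rx_prod) {
        int pkt = sc->ring[sc->rx_cons];

        sc->rx_cons = (sc->rx_cons + 1) % RING_SIZE;
        toy_input_unlocked(sc, pkt);
        handed++;
    }
    return handed;
}

/* Caches the producer index by hand; a reinit leaves the copy stale. */
static unsigned toy_rxeof_cached(struct toy_softc *sc)
{
    unsigned handed = 0;
    unsigned prod = sc->rx_prod;     /* snapshot taken once */

    assert(sc->locked);
    while (sc->rx_cons != prod) {
        int pkt = sc->ring[sc->rx_cons];

        sc->rx_cons = (sc->rx_cons + 1) % RING_SIZE;
        toy_input_unlocked(sc, pkt);
        handed++;
    }
    return handed;
}
```

In this model, with 5 pending entries and a reinit during the second unlock,
toy_rxeof() stops after 2 packets because both indexes are 0 again, whereas
toy_rxeof_cached() keeps walking the ring up to the stale producer value and
hands up stale entries. A real driver would additionally have to deal with
the hardware updating the producer index concurrently, which this model does
not attempt to show.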