From owner-cvs-all@FreeBSD.ORG Fri Dec 22 15:00:47 2006 Return-Path: X-Original-To: cvs-all@FreeBSD.org Delivered-To: cvs-all@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 62D4D16A503; Fri, 22 Dec 2006 15:00:47 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57]) by mx1.freebsd.org (Postfix) with ESMTP id 0C9DA13C4BF; Fri, 22 Dec 2006 15:00:46 +0000 (UTC) (envelope-from scottl@samsco.org) Received: from [127.0.0.1] (pooker.samsco.org [168.103.85.57]) (authenticated bits=0) by pooker.samsco.org (8.13.4/8.13.4) with ESMTP id kBMF0bk3014094; Fri, 22 Dec 2006 08:00:44 -0700 (MST) (envelope-from scottl@samsco.org) Message-ID: <458BF314.3000702@samsco.org> Date: Fri, 22 Dec 2006 10:00:36 -0500 From: Scott Long User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.0.7) Gecko/20060910 SeaMonkey/1.0.5 MIME-Version: 1.0 To: Bruce Evans References: <200612201203.kBKC3MhO053666@repoman.freebsd.org> <20061220132631.GH34400@FreeBSD.org> <20061222003115.R16146@delplex.bde.org> <20061223011349.O2603@epsplex.bde.org> In-Reply-To: <20061223011349.O2603@epsplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-1.4 required=3.8 tests=ALL_TRUSTED autolearn=failed version=3.1.1 X-Spam-Checker-Version: SpamAssassin 3.1.1 (2006-03-10) on pooker.samsco.org Cc: cvs-all@FreeBSD.org, cvs-src@FreeBSD.org, Gleb Smirnoff , Bruce Evans , src-committers@FreeBSD.org Subject: Re: cvs commit: src/sys/dev/bge if_bge.c X-BeenThere: cvs-all@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: CVS commit messages for the entire tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Dec 2006 15:00:47 -0000 Bruce Evans wrote: > On Fri, 22 Dec 2006, I wrote: > >> bge_start_locked() starts with the bge (sc) lock held and never releases >> it as far as I can see. This this problem can't happen (the lock >> prevents both txeof and the watchdog from being reached before start >> resets the timeout to 5 seconds). >> >> I could only find the lock being released and reacquired in a nested >> routine in bge_rxeof() (for calling if_input()). I hope this >> complication >> is never needed for start routines. > > Releasing the lock in in rxeof seems dangerous, but all network drivers > do it so there is some chance that it works, and I thought that it worked > as follows in one race case: race with device unload or just an ioctl: > - the device may have been reset while we were in if_input(), so we must > not use much state. bge_txeof() and bge_intr() are simple enough to > be obviously correct here, but bge_rxeof() isn't and seems to be broken > (see below) > - unload must wait for the interrupt handler to complete before removing > the interrupt handler's code. This should be handled by normal interrupt > handler rundown. The order seems to be: acquire sc lock; reset and stop > further interrupts; release sc lock; wait for interrupt handler to > finish. > > Just a few hours after I thought this, a RELENG_6 kernel crashed on a > null pointer in bge_rxeof(). I had just started rebooting using > reboot(8) with no args, while the interface was being blasted with > tiny packets using ttcp. The reboot process seems to have shut down > the interface and was waiting for something. I didn't try to understand > the null pointer. The em driver was getting into problems in a similar way because it was freeing and reallocating data structures in its reset routine. This was a bug on many levels, but it has been fixed. I didn't see similar bugs in the bge driver, but there could easily be others. In any case, the driver lock must not be held when calling if_input because there are many ways the codepath can loop from there back into if_start of the same driver or another driver. There is no way around this without doing big decoupling steps that will impact common performance cases. That said, dropping and regrabbing the driver lock in the rxeof routine of any driver is bad. It may be safe to do, but it incurs horrible performance penalties. It essentially allows the time-critical, high priority RX path to be constantly preempted by the lower priority if_start or if_ioctl paths. Even without this preemption and priority inversion, you're doing an excessive number of expensive lock ops in the fast path. The key to dropping the lock here safely, or running without a lock at all (like if_em) is to synchronize dangerous operations at a higher level. For driver unload, you need to block the interrupt and call bus_teardown_intr before freeing structures that could be touched by the interrupt handler. For taskqueue and callout driven drivers, you need to drain and block those contexts before doing anything dangerous. It's not impossible to do, you just have to be careful, think about all of the entry points, and not assume that mutexes have magical powers. Scott