From owner-freebsd-current@FreeBSD.ORG Sat Apr 26 07:33:19 2003 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4F72037B401; Sat, 26 Apr 2003 07:33:19 -0700 (PDT) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id BA80B43F85; Sat, 26 Apr 2003 07:33:17 -0700 (PDT) (envelope-from bde@zeta.org.au) Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3p2/8.8.7) with ESMTP id AAA20075; Sun, 27 Apr 2003 00:33:04 +1000 Date: Sun, 27 Apr 2003 00:33:02 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Poul-Henning Kamp In-Reply-To: <9417.1051286772@critter.freebsd.dk> Message-ID: <20030426231749.H48182@gamplex.bde.org> References: <9417.1051286772@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: gibbs@freebsd.org cc: current@freebsd.org Subject: Re: lots of malloc(M_WAITOK)'s in interrupt context from camisr X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Apr 2003 14:33:19 -0000 On Fri, 25 Apr 2003, Poul-Henning Kamp wrote: > In message <20030426013438.W38897@gamplex.bde.org>, Bruce Evans writes: > > >+ if (flags & M_WAITOK) > >+ if (curthread->td_ithd != NULL || > >+ curthread->td_intr_nesting_level != 0) > >+ Debugger("malloc(M_WAITOK) in interrupt context"); > >[...] > >The mallocs are actually in sysctl support code and other subsystems that > >camisr wanders off into, so the fix isn't just to use M_NOWAIT. > > I would tend to think that while sleeping in interrupt threads > should be avoided, it should only be avoided "at reasonable cost", > not "at any cost". I think it may actually work now in some cases, since there is now an ithread context to sleep on, but camisr() in general is not such a case, and no cases should be permitted since permitting some messes up correctness checking for the others. The cases that may work are where the sleeper has sufficient locks on its state variables and sleeping doesn't cause deadlock. I think camisr() is needed for disk i/o, so it is not one of these cases except possibly during boot-time initialization (otherwise there is deadlock if cam i/o is required to free pages for malloc()). > For example, CAM creates new devices from its interrupt context and > ends up calling make_dev(). Yes, this is a bug in cam. I didn't mention it because it is more obviously a bug to call sysctl support code than to call device support code, since the sysctl support code is closer to userland and should only be expected to have locking appropriate for user threads. This is currently just Giant locking. kern_sysctl.c currently doesn't reference Giant except at its userland entry point[s], but of course it needs it almost everywhere. There must be a common lock to protect things, so camisr() must be protected by Giant too. It just uses !INTR_MPSAFE. > While I probably could implement make_dev() in a M_NOWAIT fashion, it > is certainly not worth it considering how often/seldom it is used > and the general havoc and delay we usually encounter at the hardware > level when it is called. > > Of course, changing CAM to create devices in some other context than > interrupt would solve this particular instance, but I belive we already > have other similar corner cases. Apparently not many for malloc() from interrupt handlers -- the assertion only failed for camisr(). This is not surprising, since most interrupt handlers are not very different from the RELENG_4 ones which had to pass a nonbroken assertion. I think other corner cases would fail locking assertions more than malloc() ones and we just don't see them yet because almost everything is locked by Giant. > So maybe we need a malloc flag more, to indicate that "Yeah, I know > I really shouldn't sleep here, but the alternative is worse". No, this would be an unreasoanble cost. Support code shouldn't have complications to support callers doing bad things. Bruce