From owner-freebsd-amd64@FreeBSD.ORG Wed Oct 26 16:01:53 2005
Message-Id: <6.2.3.4.0.20051026163501.03b7d3e8@wheresmymailserver.com>
Date: Wed, 26 Oct 2005 18:01:34 +0200
To: Scott Long
From: Jacques Caron
In-Reply-To: <435F8E06.9060507@samsco.org>
Cc: freebsd-amd64@freebsd.org, sos@freebsd.org
Subject: Re: busdma dflt_lock on amd64 > 4 GB
List-Id: Porting FreeBSD to the AMD64 platform

Hi Scott,

Thanks for the input. I'm utterly lost in unknown terrain, but I'm
trying to understand...

At 16:09 26/10/2005, Scott Long wrote:
>So, the panic is doing exactly what it is supposed to do. It's guarding
>against bugs in the driver. The workaround for this is to use the
>NOWAIT flag in all instances of bus_dmamap_load() where deferrals can
>happen.

As pointed out by Soren, this is not documented in man bus_dma :-/
It says bus_dmamap_load flags are supposed to be 0, and that
BUS_DMA_ALLOCNOW should be set at tag creation to avoid EINPROGRESS.
I'm not sure the two would actually be equivalent, either. And from
what I understand, even a call to bus_dma_tag_create with
BUS_DMA_ALLOCNOW can be successful but not actually allocate what will
be needed later (see below).

> This, however, means that using bounce pages still remains
> fragile and that the driver is still likely to return ENOMEM to the
> upper layers. C'est la vie, I guess. At one time I had patches that
>made ATA use the busdma API correctly (it is one of the few remaining
>that does not), but they rotted over time.

So what would be the "correct" way? Move the part that comes after the
DMA setup into the callback? I suppose there are limitations as to what
can happen in the callback, though, so it would complicate things quite
a bit. Obviously, a lockfunc would be needed in this situation, right?

Also, I believe many other drivers just have lots of BUS_DMA_ALLOCNOW
or BUS_DMA_NOWAIT all over the place. I'm not sure that's the "correct"
way, is it?

>No. Some tags specifically should not permit deferrals.

How do they do that? Setting BUS_DMA_ALLOCNOW in the tag, or
BUS_DMA_NOWAIT in the map_load, or both, or something else? What should
make one decide when deferrals should not be permitted? It is my
impression that quite a few drivers happily decide they don't like
deferrals at all, whatever happens...

>Just about every other modern driver honors the API correctly.

Depends what you mean by "correctly". I'm not sure using BUS_DMA_NOWAIT
is the right way to go, as it fails if there is contention for bounce
buffers.

>Bounce pages cannot be reclaimed to the system, so overallocating just
>wastes memory.

I'm not talking about over-allocating, but rather allocating what is
needed. I don't understand why bus_dma_tag_create limits the total
number of bounce pages in a bounce zone to maxsize if BUS_DMA_ALLOCNOW
is set (which prevents bus_dmamap_create from allocating any further
bounce pages as long as there's only one map per tag, which seems
pretty common), while bus_dmamap_create will allocate a further maxsize
worth of pages if BUS_DMA_ALLOCNOW was not set.

The end result is that the ata driver is limited to 32 bounce pages
whatever the number of instances (I guess that's channels, or disks?),
while other drivers get hundreds of bounce pages which they hardly use.
Maybe this is intended and it's just the way the ata driver uses tags
and maps that is wrong, or maybe it's the busdma logic that is wrong.
I don't know...

> The whole point of the deferral mechanism is to allow
>you to allocate enough pages for a normal load while also being able to
>handle sporadic spikes in load (like when the syncer runs) without
>trapping memory.

In this case 32 bounce pages (out of 8 GB RAM) for 6 disks seems like a
very tight bottleneck to me.

Jacques.