Date: Wed, 26 Oct 2005 17:11:14 +0200
From: Søren Schmidt <sos@FreeBSD.ORG>
To: Scott Long <scottl@samsco.org>
Cc: freebsd-amd64@FreeBSD.ORG
Subject: Re: busdma dflt_lock on amd64 > 4 GB
Message-ID: <08A81034-AB5D-4BFC-8F53-21501073D674@FreeBSD.ORG>
In-Reply-To: <435F8E06.9060507@samsco.org>
References: <6.2.3.4.0.20051025171333.03a15490@pop.interactivemediafactory.net>
            <6.2.3.4.0.20051026131012.03a80a20@pop.interactivemediafactory.net>
            <435F8E06.9060507@samsco.org>
On 26/10/2005, at 16:09, Scott Long wrote:

> Jacques Caron wrote:
>> Hi all,
>>
>> Continuing on this story... [I took the liberty of CC'ing Scott
>> and Soren], pr is amd64/87977 though it finally isn't
>> amd64-specific but >4GB-specific.
>>
>> There is really a big problem somewhere between ata and bus_dma
>> for boxes with more than 4 GB RAM and more than 2 ata disks:
>> * bounce buffers will be needed
>> * ata will have bus_dma allocate bounce buffers:
>>     hw.busdma.zone1.total_bpages: 32
>>     hw.busdma.zone1.free_bpages: 32
>>     hw.busdma.zone1.reserved_bpages: 0
>>     hw.busdma.zone1.active_bpages: 0
>>     hw.busdma.zone1.total_bounced: 27718
>>     hw.busdma.zone1.total_deferred: 0
>>     hw.busdma.zone1.lowaddr: 0xffffffff
>>     hw.busdma.zone1.alignment: 2
>>     hw.busdma.zone1.boundary: 65536
>> * if I do a dd with a bs=256000, 16 bounce pages will be used
>>   (most of the time). As long as I stay on the same disk, no more
>>   pages will be used.
>> * as soon as I access another disk (e.g. with another dd with the
>>   same bs=256000), another set of 16 pages will be used (bus_dma
>>   tags and maps are allocated on a per-channel basis), and all 32
>>   bounce pages will be used (most of the time)
>> * and if I try to access a third disk, more bounce pages are
>>   needed and:
>>   - one of ata_dmaalloc calls to bus_dma_tag_create has ALLOCNOW set
>>   - busdma_machdep will not allocate more bounce pages in that case
>>     (the limit is imposed by maxsize in that situation, which has
>>     already been reached)
>>   - ata_dmaalloc will fail
>>   - but some other bus_dma_tag_create call without ALLOCNOW set will
>>     still cause bounce pages to be allocated, but deferred, and the
>>     non-existent lockfunc to be called, and panic.
>>
>> Adding the standard lockfunc will (probably) solve the panic
>> issue, but there will still be a problem with DMA in ata.
>
> Actually, it won't.  It'll result in silent data corruption.  What is
> happening is that bus_dmamap_load() is returning EINPROGRESS, but the
> ATA driver ignores it and assumes that the load failed.  Later on the
> busdma subsystem tries to run the callback but trips over the
> intentional assertion.  If the standard lock was used, then the
> callback would succeed and start spamming memory that either had been
> freed or is in the process of being used by other ATA commands.

Ehm, according to the man page the load should succeed for at least
one map when the ALLOCNOW flag is set. ATA only uses one map, so there
is no way that spamming can happen.
The bug in ATA is that the sg_tag and the work_tag are not created with
the ALLOCNOW flag, so if all resources are used before they are called
things get messy. The below patch takes care of that problem.

> So, the panic is doing exactly what it is supposed to do.  It's
> guarding against bugs in the driver.  The workaround for this is to
> use the NOWAIT flag in all instances of bus_dmamap_load() where
> deferrals can happen.  This, however, means that using bounce pages
> still remains fragile and that the driver is still likely to return
> ENOMEM to the upper layers.  C'est la vie, I guess.  At one time I had
> patches that made ATA use the busdma API correctly (it is one of the
> few remaining that does not), but they rotted over time.

As long as ATA doesn't do tags there is no gain by changing this at
all, except spamming the code with all the callback crap that's not
needed.
According to the man page bus_dmamap_load takes no flags, so that's
why that's not done. Besides, it's not needed, as shown above.

Søren Schmidt
sos@FreeBSD.org
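
The page arithmetic in the quoted report (a pool of 32 bounce pages, 16 per busy channel, a third channel left without any) can be sketched as a toy model. This is not the busdma API: `BouncePool`, `load` and the "ok"/"deferred" strings are hypothetical names chosen for illustration, and the model assumes every channel keeps its pages because all of them have a transfer in flight. Only the figures 32 and 16 come from the report above.

```python
from dataclasses import dataclass, field

POOL_PAGES = 32         # hw.busdma.zone1.total_bpages in the report
PAGES_PER_CHANNEL = 16  # bounce pages one busy channel used in the report


@dataclass
class BouncePool:
    """Toy stand-in for a bounce-page zone (hypothetical, not kernel code)."""
    total: int = POOL_PAGES
    reserved: dict = field(default_factory=dict)

    @property
    def free(self):
        # Pages not currently held by any channel.
        return self.total - sum(self.reserved.values())

    def load(self, channel, pages):
        """Reserve pages for a channel; "deferred" stands in for EINPROGRESS."""
        if channel in self.reserved:
            return "ok"        # same channel reuses the pages it already holds
        if pages <= self.free:
            self.reserved[channel] = pages
            return "ok"
        return "deferred"      # pool exhausted; the load cannot complete now


pool = BouncePool()
results = [pool.load(ch, PAGES_PER_CHANNEL) for ch in ("ata0", "ata1", "ata2")]
print(results, pool.free)  # ['ok', 'ok', 'deferred'] 0
```

The third channel is the case the thread argues about: whether the driver should treat the deferred load as a failure or wait for a callback.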