Date: Fri, 24 Apr 2009 10:59:30 -0400
From: John Baldwin <jhb@freebsd.org>
To: freebsd-hackers@freebsd.org
Subject: Re: Using bus_dma(9)
Message-ID: <200904241059.30788.jhb@freebsd.org>
In-Reply-To: <20090423195928.GB8531@server.vk2pj.dyndns.org>
References: <20090423195928.GB8531@server.vk2pj.dyndns.org>
On Thursday 23 April 2009 3:59:28 pm Peter Jeremy wrote:
> I'm currently trying to port some code that uses bus_dma(9) from
> OpenBSD to FreeBSD and am having some difficulties in following the
> way bus_dma is intended to be used on FreeBSD (and how it differs from
> Net/OpenBSD).  Other than the man page and existing FreeBSD drivers, I
> am unable to locate any information on bus_dma care and feeding.  Has
> anyone written a tutorial or guide to using bus_dma?
>
> The OpenBSD man page provides pseudo-code showing the basic cycle.
> Unfortunately, FreeBSD doesn't provide any similar pseudo-code, and
> the functionality is distributed somewhat differently amongst the
> functions (and the drivers I've looked at tend to use a different
> order of calls).
>
> So far, I've hit a number of issues that I'd like some advice on:
>
> Firstly, the OpenBSD model only provides a single DMA tag for the
> device at attach() time, whereas FreeBSD provides the parent's DMA tag
> at attach time and allows the driver to create multiple tags.  Rather
> than just creating a single tag for a device, many drivers create a
> device tag which is only used as the parent for additional tags to
> handle receive, transmit, etc.  Whilst the need for multiple tags is
> probably a consequence of moving much of the dmamap information from
> OpenBSD bus_dmamap_create() into FreeBSD bus_dma_tag_create(), the
> rationale behind multiple levels of tags is unclear.  Is this solely
> to provide a single point where overall device DMA characteristics &
> limitations can be specified, or is there another reason?

Many drivers provide a parent "driver" tag specifically to have a single
point, yes.

> Secondly, bus_dma_tag_create() supports a BUS_DMA_ALLOCNOW flag that
> "pre-allocates enough resources to handle at least one map load
> operation on this tag".  However, it also states "[t]his should not be
> used for tags that only describe buffers that will be allocated with
> bus_dmamem_alloc()" - does this mean that only one of bus_dmamap_load()
> or bus_dmamem_alloc() should be used on a tag/mapping?  Or is the
> sense backwards (i.e. "don't specify BUS_DMA_ALLOCNOW for tags that are
> only used as the parent for other tags and never mapped themselves")?
> Or is there some other explanation?

What usually happens now is that each thing you want to pre-allocate memory
for using bus_dmamem_alloc() (such as descriptor rings) uses its own tag.
This is somewhat mandated by the fact that bus_dmamem_alloc() doesn't take
a size but gets the size to allocate from the tag.  So usually a NIC driver
will have 3 tags: 1 for the RX ring, 1 for packet data, and 1 for the TX
ring.  Some drivers have 2 tags for packet data: 1 for TX buffers and 1 for
RX buffers.
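To make that three-tag layout concrete, a rough sketch of the tag setup
might look like the following.  The mydev_* names, ring size, segment
limits, and the 32-bit address restriction are all invented for
illustration and are not taken from any particular driver; only the
bus_dma_tag_create() argument order is the real interface:

/*
 * Hypothetical per-device state; names and sizes are illustrative only.
 */
#include <sys/param.h>
#include <sys/bus.h>
#include <machine/bus.h>

#define MYDEV_RING_SIZE         (256 * 16)      /* assumed ring size in bytes */
#define MYDEV_MAX_SEGS          8               /* assumed s/g limit per packet */

struct mydev_softc {
        device_t        dev;
        bus_dma_tag_t   parent_tag;     /* single point for device-wide limits */
        bus_dma_tag_t   ring_tag;       /* only used with bus_dmamem_alloc() */
        bus_dma_tag_t   data_tag;       /* per-packet maps are created from this */
        void            *ring_vaddr;    /* KVA of the descriptor ring */
        bus_addr_t      ring_busaddr;   /* bus address of the descriptor ring */
        bus_dmamap_t    ring_map;
};

static int
mydev_create_tags(struct mydev_softc *sc)
{
        int error;

        /* Parent tag: the single point expressing the device's DMA limits. */
        error = bus_dma_tag_create(bus_get_dma_tag(sc->dev),
            1, 0,                       /* alignment, boundary */
            BUS_SPACE_MAXADDR_32BIT,    /* lowaddr: e.g. a 32-bit-only device */
            BUS_SPACE_MAXADDR,          /* highaddr */
            NULL, NULL,                 /* filter, filterarg */
            BUS_SPACE_MAXSIZE_32BIT,    /* maxsize */
            0,                          /* nsegments */
            BUS_SPACE_MAXSIZE_32BIT,    /* maxsegsize */
            0, NULL, NULL,              /* flags, lockfunc, lockfuncarg */
            &sc->parent_tag);
        if (error != 0)
                return (error);

        /* Ring tag: bus_dmamem_alloc() gets its allocation size from this tag. */
        error = bus_dma_tag_create(sc->parent_tag, PAGE_SIZE, 0,
            BUS_SPACE_MAXADDR, BUS_SPACE_MAXADDR, NULL, NULL,
            MYDEV_RING_SIZE, 1, MYDEV_RING_SIZE, 0, NULL, NULL,
            &sc->ring_tag);
        if (error != 0)
                return (error);

        /* Data tag: one map per RX/TX buffer will be created from this. */
        error = bus_dma_tag_create(sc->parent_tag, 1, 0,
            BUS_SPACE_MAXADDR, BUS_SPACE_MAXADDR, NULL, NULL,
            MCLBYTES * MYDEV_MAX_SEGS, MYDEV_MAX_SEGS, MCLBYTES,
            0, NULL, NULL, &sc->data_tag);
        return (error);
}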
> Thirdly, bus_dmamap_load() uses a callback function to return
> the actual mapping details.  According to the man page, there is no
> way to ensure that the callback occurs synchronously - a caller can
> only request that bus_dmamap_load() fail if resources are not
> immediately available.  Despite this, many drivers pass 0 for flags
> (allowing an asynchronous invocation of the callback) and then fail
> (and clean up) if bus_dmamap_load() returns EINPROGRESS.  This appears
> to open a race condition where the callback and cleanup could occur
> simultaneously.  Mitigating the race condition seems to rely on one of
> the following two behaviours:
>
> a) The system is implicitly single-threaded when bus_dmamap_load() is
> called (generally as part of the device attach() function).  Whilst
> this is true at boot time, it would not be true for a dynamically
> loaded module.
>
> b) Passing BUS_DMA_ALLOCNOW to bus_dma_tag_create() guarantees that
> the first bus_dmamap_load() on that tag will be synchronous.  Is this
> true?  Whilst it appears to be implied, it's not explicitly stated.

That doesn't really guarantee it either, as the pool of bounce pages can be
shared across multiple tags.  I think what you might be missing is this:

c) bus_dmamap_load() of a map returned from bus_dmamem_alloc() will always
succeed synchronously.

That is the only case other than BUS_DMA_NOWAIT where one can assume
synchronous calls to the callback.  Also, some bus_dma calls basically
assume BUS_DMA_NOWAIT, such as bus_dmamap_load_mbuf() and
bus_dmamap_load_mbuf_sg().

> Finally, what are the ordering requirements between the alloc, create,
> load and sync functions?  OpenBSD implies that the normal ordering is
> create, alloc, load, sync whilst several FreeBSD drivers use
> tag_create, alloc, load and then create.

FreeBSD uses the same ordering as OpenBSD.  I think you might be confused
by the bus_dmamem_alloc() case.  There are basically two cases.  The first
is preallocating a block of RAM to use for a descriptor or command ring:

alloc_ring:

        bus_dma_tag_create(..., &ring_tag);
        /* Creates a map internally. */
        bus_dmamem_alloc(ring_tag, &p, ..., &ring_map);
        /* Will not fail with EINPROGRESS. */
        bus_dmamap_load(ring_tag, ring_map, p, ...);

free_ring:

        bus_dmamap_unload(ring_tag, ring_map);
        bus_dmamem_free(ring_tag, p, ring_map);
        bus_dma_tag_destroy(ring_tag);

The second case is when you want to handle data transfer requests (queue a
packet or disk I/O request, etc.).  For this the typical model in FreeBSD
is to create a single tag and then pre-create a map for each descriptor or
command:

setup_data_maps:

        bus_dma_tag_create(..., &tag);
        for (i = 0; i < NUM_RXD; i++)
                bus_dmamap_create(tag, ..., &rxdata[i].map);
        for (i = 0; i < NUM_TXD; i++)
                bus_dmamap_create(tag, ..., &txdata[i].map);

queue_a_rx_buffer:

        i = index_of_first_free_RX_descriptor;
        m = m_getcl(...);
        rxdata[i].mbuf = m;
        bus_dmamap_load_mbuf_sg(tag, rxdata[i].map, m, ...);
        /* populate s/g list in i'th RX descriptor ring entry */
        bus_dmamap_sync(rx_ring_tag, rx_ring_map, ...);

dequeue_an_rx_buffer_on_rx_completion:

        i = index_of_completed_receive_descriptor;
        bus_dmamap_sync(tag, rxdata[i].map, ...);
        bus_dmamap_unload(tag, rxdata[i].map);
        m = rxdata[i].mbuf;
        rxdata[i].mbuf = NULL;
        ifp->if_input(ifp, m);

free_data_maps:

        for (i = 0; i < NUM_RXD; i++)
                bus_dmamap_destroy(tag, rxdata[i].map);
        for (i = 0; i < NUM_TXD; i++)
                bus_dmamap_destroy(tag, txdata[i].map);
        bus_dma_tag_destroy(tag);

In a typical NIC driver you will probably be doing alloc_ring and
setup_data_maps at the same time during your attach routine, and similarly
free_ring and free_data_maps during detach.

> As a side-note, the manpage does not document the behaviour when
> bus_dmamap_destroy() or bus_dma_tag_destroy() is called whilst a
> bus_dmamap_load() callback is queued.  Is the callback cancelled,
> or do one or both destroy operations fail?

Looking at amd64: bus_dma_tag_destroy() of a tag that still has maps
created from it will fail with EBUSY on HEAD (this may not be in 7.x yet),
and bus_dmamap_destroy() of a map that still has bounce buffers in use will
fail with EBUSY as well.
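To tie case c) and the alloc_ring skeleton together, here is a rough,
compilable version of the ring allocation using the invented struct
mydev_softc fields from the earlier sketch.  The names, flags, and the
single-segment assumption are illustrative, not a drop-in implementation:

/*
 * Rough illustration only; mydev_* and MYDEV_RING_SIZE are hypothetical
 * names introduced in the sketch above, not part of any real driver.
 */
#include <sys/param.h>
#include <sys/systm.h>
#include <sys/bus.h>
#include <machine/bus.h>

/* Record the ring's bus address; runs synchronously per case c) above. */
static void
mydev_ring_map_cb(void *arg, bus_dma_segment_t *segs, int nseg, int error)
{
        bus_addr_t *busaddrp = arg;

        if (error != 0)
                return;
        KASSERT(nseg == 1, ("ring should map to a single segment"));
        *busaddrp = segs[0].ds_addr;
}

static int
mydev_alloc_ring(struct mydev_softc *sc)
{
        int error;

        /* The allocation size comes from ring_tag, not from an argument. */
        error = bus_dmamem_alloc(sc->ring_tag, &sc->ring_vaddr,
            BUS_DMA_WAITOK | BUS_DMA_ZERO | BUS_DMA_COHERENT, &sc->ring_map);
        if (error != 0)
                return (error);

        /*
         * For bus_dmamem_alloc()'d memory the callback fires before
         * bus_dmamap_load() returns, so ring_busaddr is valid immediately
         * afterwards and EINPROGRESS is not a concern.
         */
        error = bus_dmamap_load(sc->ring_tag, sc->ring_map, sc->ring_vaddr,
            MYDEV_RING_SIZE, mydev_ring_map_cb, &sc->ring_busaddr,
            BUS_DMA_NOWAIT);
        if (error != 0) {
                bus_dmamem_free(sc->ring_tag, sc->ring_vaddr, sc->ring_map);
                sc->ring_vaddr = NULL;
        }
        return (error);
}

Teardown mirrors free_ring above: bus_dmamap_unload(), then
bus_dmamem_free(), then bus_dma_tag_destroy().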
-- 
John Baldwin