Date:      Fri, 11 Jun 2010 11:30:49 -0700
From:      Pyun YongHyeon <pyunyh@gmail.com>
To:        John Baldwin <jhb@freebsd.org>
Cc:        svn-src-head@freebsd.org, Scott Long <scottl@samsco.org>, Marcel Moolenaar <xcllnt@mac.com>, src-committers@freebsd.org, svn-src-all@freebsd.org
Subject:   Re: svn commit: r209026 - in head/sys/ia64: ia64 include
Message-ID:  <20100611183049.GH13776@michelle.cdnetworks.com>
In-Reply-To: <201006111420.46919.jhb@freebsd.org>
References:  <201006110300.o5B30X9q045387@svn.freebsd.org> <9F065122-7D91-42E9-A251-5AF4AAF0B4E5@samsco.org> <20100611175016.GD13776@michelle.cdnetworks.com> <201006111420.46919.jhb@freebsd.org>

On Fri, Jun 11, 2010 at 02:20:46PM -0400, John Baldwin wrote:
> On Friday 11 June 2010 1:50:16 pm Pyun YongHyeon wrote:
> > On Fri, Jun 11, 2010 at 11:44:39AM -0600, Scott Long wrote:
> > > On Jun 11, 2010, at 11:41 AM, Pyun YongHyeon wrote:
> > > > On Fri, Jun 11, 2010 at 11:37:36AM -0600, Scott Long wrote:
> > > >> On Jun 11, 2010, at 11:32 AM, Marcel Moolenaar wrote:
> > > >>> 
> > > >>> On Jun 11, 2010, at 10:21 AM, Scott Long wrote:
> > > >>> 
> > > >>>> On Jun 11, 2010, at 11:04 AM, Marcel Moolenaar wrote:
> > > >>>>> 
> > > >>>>> On Jun 11, 2010, at 9:12 AM, Scott Long wrote:
> > > >>>>> 
> > > >>>>>> On Jun 11, 2010, at 5:51 AM, John Baldwin wrote:
> > > >>>>>>> On Thursday 10 June 2010 11:00:33 pm Marcel Moolenaar wrote:
> > > >>>>>>>> Author: marcel
> > > >>>>>>>> Date: Fri Jun 11 03:00:32 2010
> > > >>>>>>>> New Revision: 209026
> > > >>>>>>>> URL: http://svn.freebsd.org/changeset/base/209026
> > > >>>>>>>> 
> > > >>>>>>>> Log:
> > > >>>>>>>> Bump MAX_BPAGES from 256 to 1024. It seems that a few drivers, bge(4)
> > > >>>>>>>> in particular, do not handle deferred DMA map load operations at all.
> > > >>>>>>>> Any error, and especially EINPROGRESS, is treated as a hard error and
> > > >>>>>>>> typically abort the current operation. The fact that the busdma code
> > > >>>>>>>> queues the load operation for when resources (i.e. bounce buffers in
> > > >>>>>>>> this particular case) are available makes this especially problematic.
> > > >>>>>>>> Bounce buffering, unlike what the PR synopsis would suggest, works
> > > >>>>>>>> fine.
> > > >>>>>>>> 
> > > >>>>>>>> While on the subject, properly implement swi_vm().
> > > >>>>>>> 
> > > >>>>>>> NIC drivers do not handle deferred load operations at all (note that
> > > >>>>>>> bus_dmamap_load_mbuf() and bus_dmamap_load_mbuf_sg() enforce BUS_DMA_NOWAIT).
> > > >>>>>>> It is common practice to just drop the packet in that case.
> > > >>>>>>> 
> > > >>>>>> 
> > > >>>>>> Yes, long ago when network drivers started being converted to busdma,
> > > >>>>>> it was agreed that EINPROGRESS simply doesn't make sense for them.  Any
> > > >>>>>> platform that winds up making extensive use of bounce buffers for network
> > > >>>>>> hardware is going to perform poorly no matter what, and should hopefully
> > > >>>>>> have some sort of IOMMU that can be used instead.
> > > >>>>> 
> > > >>>>> Unfortunately things aren't as simple as is presented.
> > > >>>>> 
> > > >>>>> For one, bge(4) wedges as soon as the platform runs out of bounce
> > > >>>>> buffers when they're needed. The box needs to be reset in order to
> > > >>>>> get the interface back. I pick any implementation that remains
> > > >>>>> functional over a mis-optimized one that breaks. Deferred load
> > > >>>>> operations are more performance optimal than failure is.
> > > >>>>> 
> > > >>>> 
> > > >>>> This sounds like a bug in the bge driver.  I don't see it through casual
> > > >>>> inspection, but the driver should be able to either drop the mbuf entirely,
> > > >>>> or requeue it on the ifq and then restart the ifq later.
> > > >>>> 
> > > >>>>> Also: the kernel does nothing to guarantee maximum availability
> > > >>>>> of DMA-able memory under load, so bounce buffers (or use of I/O
> > > >>>>> MMUs for that matter) are a reality. Here too the performance
> > > >>>>> argument doesn't necessarily hold because the kernel may be
> > > >>>>> busy with more than just sending and receiving packets and the
> > > >>>>> need to defer load operations is very appropriate. If the
> > > >>>>> alternative is just dropped packets, I'm fine with that too,
> > > >>>>> but I for one cannot say that *not* filling a H/W ring with
> > > >>>>> buffers is not going to wedge the hardware in some cases.
> > > >>>>> 
> > > >>>>> Plus: SGI Altix does not have any DMA-able memory for 32-bit
> > > >>>>> hardware. The need for an I/O MMU is absolute and since there
> > > >>>>> are typically less mapping registers than packets on a ring,
> > > >>>>> the need for deferred operation seems quite acceptable if the
> > > >>>>> alternative is, again, failure to operate.
> > > >>>>> 
> > > >>>> 
> > > >>>> I'm not against you upping the bounce buffer limit for a particular
> > > >>>> platform, but it's still unclear to me if (given bug-free drivers) it's
> > > >>>> worth the effort to defer a load rather than just drop the packet and let
> > > >>>> the stack retry it.  One question that would be good to answer is whether
> > > >>>> the failed load is happening in the RX or TX path.
> > > >>> 
> > > >>> RX path I believe.
> > > >>> 
> > > >> 
> > > >> I'm not clear why you even need bounce buffers for RX.  The chip supports
> > > >> 64bit addresses with no boundary or alignment restrictions.
> > > >> 
> > > > 
> > > > Some controllers have 4G boundary bug so bge(4) restricts dma
> > > > address space.
> > > 
> > > That limitation should be reflected in the boundary attribute of the tag,
> > > not the lowaddr/highaddr attributes.
> > > 
> > 
> > Yes, but that needed more code. And I don't have these buggy
> > controllers, so I chose a simpler way that would work even though
> > it may be inefficient.
> 
> You can just use a 2GB boundary as a workaround.  Look at what the twa(4)
> driver does to enforce a 4GB boundary for an example.
> 

I vaguely remember the problem was that DMA memory allocated with
bus_dmamem_alloc(9) does not honor the boundary argument, so bge(4)
had to ensure the allocated memory is within 4GB.
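
To make the difference between the two constraints concrete, here is a
rough userspace sketch. It is not busdma code; crosses_boundary() and
below_4g() are names made up for this illustration, and the addresses
are plain 64-bit integers rather than real bus addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of busdma(9)): returns true if the
 * segment [addr, addr + len) crosses a multiple of `boundary`.
 * `boundary` must be a nonzero power of two and `len` must be nonzero.
 * Assumes addr + len - 1 does not overflow 64 bits.
 */
static bool
crosses_boundary(uint64_t addr, uint64_t len, uint64_t boundary)
{
	uint64_t mask;

	assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
	assert(len != 0);

	/* Two addresses share a boundary window iff their high bits match. */
	mask = ~(boundary - 1);
	return ((addr & mask) != ((addr + len - 1) & mask));
}

/*
 * Hypothetical helper: returns true if the whole segment lies below
 * 4GB, which is what restricting the address range to 32 bits demands.
 */
static bool
below_4g(uint64_t addr, uint64_t len)
{
	return (addr + len - 1 <= UINT64_C(0xFFFFFFFF));
}
```

A buffer at 0x100000000 with length 0x1000 does not cross a 4GB
boundary, yet it fails the below-4GB check, so restricting the address
range forces bouncing for memory that a boundary constraint alone
would accept.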

> -- 
> John Baldwin


