Date: Thu, 24 Jul 2014 15:25:37 +0800
From: Xu Zhe <xzpeter@gmail.com>
To: Jack Vogel <jfvogel@gmail.com>
Cc: FreeBSD Net <freebsd-net@freebsd.org>
Subject: Re: Question on rx queue in ixgbe driver
Message-ID: <53D0B4F1.2030408@gmail.com>
In-Reply-To: <CAFOYbck6fLbL19g5APmFrp2UnfhETf7ncFTuQdC2Q%2BibCHHzFw@mail.gmail.com>
References: <53CFAC8F.8090404@gmail.com> <CAFOYbck6fLbL19g5APmFrp2UnfhETf7ncFTuQdC2Q%2BibCHHzFw@mail.gmail.com>
Hi, Jack,

Thanks for the quick response and detailed explanation. :)

I think I now understand why tail is always one less than head, if that is
the rule laid down by the hardware datasheet.

As for the initial value, I am still not quite clear why the starting state
should be all zeros. I traced a real test with ixgbe driver version 2.5.8:
when ix0 is initialized (the first IP is set), ixgbe_msix_que is triggered
once (even though no packet has been received, I suppose; I don't know why
yet), and the rx_ring descriptors are re-initialized, because
ixgbe_rx_unrefreshed assumes there are unrefreshed descriptors (though there
are none, since rx_head == rx_tail == 0). Then, during the refresh in
ixgbe_refresh_mbufs(), the loop runs 2048 times to re-initialize all the
descriptors.

IMHO, another way to change this might be in ixgbe_rx_unrefreshed: when head
equals tail, we could return zero, since this is the start phase.

After all, this should not be a big problem, since it only happens when the
interface is first initialized (or goes down -> up), and it should not exist
in the latest ixgbe 2.5.15 driver (in the latest driver, I found that
ixgbe_msix_que returns directly when it fails to find the DRV_RUNNING flag).

Peter

On 14-7-24 1:32, Jack Vogel wrote:
> HEAD and TAIL are actually hw registers; the driver as it is configured
> these days never modifies (and cannot modify) HEAD. There is an option to
> use the HEAD register as a method of managing the RX side (called Head
> Writeback), but in ixgbe this is not used; rather, we rely on the DD bit of
> a descriptor being written back by the hardware to indicate an operation is
> complete.
>
> The span between HEAD and TAIL is the extent of usable buffers for the hw.
> In the starting state, having them equal indicates a clean slate; however,
> once operating, TAIL will trail HEAD, the range between them being
> "uncleaned".
>
> The next_to_check and next_to_refresh indices are used by the driver to
> manage the ring.
> I designed them to allow a separated operation of cleaning and of
> refreshing the buffers. In old drivers these two were always in lock step,
> and you could never send a packet to the stack until you FIRST did the mbuf
> allocation; then, if that failed for any reason, the packet was actually
> dropped in order to keep that precious mbuf cluster. Now refresh can be
> deferred by as many as 8 while cleaning.
>
> These two indices are each exclusively controlled by those two operations,
> and it's only refresh, by writing the TAIL register, that actually limits
> what the hw engine can do, since it will not go beyond TAIL.
>
> There is no "should be" on those indices; they are set up so things operate
> correctly, and they do :)
>
> Hope this clarifies things somewhat?
>
> Jack
>
> On Wed, Jul 23, 2014 at 5:37 AM, Xu Zhe <xzpeter@gmail.com> wrote:
>
>     Hi,
>
>     I am reading the ixgbe driver of FreeBSD and have some questions, which
>     are described below.
>
>     (1) Why does rxd_tail not equal rxd_head?
>
>     Here, rxd_tail/rxd_head are the values of
>     dev.ix.0.queue0.rxd_tail/dev.ix.0.queue0.rxd_head from sysctl (taking
>     the first queue of ix0 as an example).
>
>     Actually, in most cases, rxd_head - rxd_tail == 1.
>
>     According to the code, these values are actually
>     next_to_refresh/next_to_check for each receive queue (though the sysctl
>     implementation reads them directly from the hardware registers, I
>     suppose). Why is next_to_refresh always one smaller than next_to_check
>     (of course, when the latter is 0, the former is possibly 2047, which is
>     the size of the rx ring - 1)? From my point of view, this means that we
>     will always have one tiny mbuf (the one pointed to by next_to_refresh)
>     that is already checked (passed up to the upper network stack) but not
>     refreshed (not prepared for the next receive). Does that not make
>     sense? Or did I miss anything important?
>
>     (2) The init value of rxd_tail
>
>     Even if (1) is not a problem, when the ixgbe device is initialized,
>     rxd_tail (or say, next_to_refresh) is set to zero (in function
>     ixgbe_setup_receive_ring). I think it should be rxr->num_desc - 1.
>
>     This should not matter much in the latest ixgbe driver, but it might
>     cause an old driver (ixgbe 2.5.8 at least) to initialize the rx_ring
>     descriptors twice (both in ixgbe_setup_receive_ring when ixgbe comes
>     up, and on the first entry into ixgbe_refresh_mbufs when the first
>     interrupt arrives). This is a case I met in my test environment.
>
>     ==========
>
>     Looking forward to any replies that help clarify my thoughts. Thanks in
>     advance.
>
>     Peter
>     _______________________________________________
>     freebsd-net@freebsd.org mailing list
>     http://lists.freebsd.org/mailman/listinfo/freebsd-net
>     To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
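
For readers following the index arithmetic in this thread, here is a minimal
standalone sketch of the "unrefreshed" count as the messages describe it, next
to the variant Peter suggests for ixgbe_rx_unrefreshed. It is not copied from
the driver: struct rx_ring_model and both function names are hypothetical, and
the arithmetic only assumes the rule stated above, that next_to_refresh sits
one slot behind next_to_check when every buffer has been refreshed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the driver's per-queue RX state. */
struct rx_ring_model {
	uint16_t next_to_check;   /* next descriptor the cleaner will inspect */
	uint16_t next_to_refresh; /* next descriptor that needs a fresh mbuf */
	uint16_t num_desc;        /* ring size, e.g. 2048 */
};

/*
 * Count of descriptors waiting for refresh, assuming refresh trails check
 * by one slot when the ring is fully refreshed. Under that assumption,
 * equal indices read as "num_desc - 1 descriptors are stale".
 */
static uint16_t
unrefreshed_as_described(const struct rx_ring_model *r)
{
	assert(r->num_desc > 0);
	if (r->next_to_check > r->next_to_refresh)
		return (uint16_t)(r->next_to_check - r->next_to_refresh - 1);
	/* Wrapped case, which also covers equal indices. */
	return (uint16_t)(r->num_desc + r->next_to_check -
	    r->next_to_refresh - 1);
}

/* Variant suggested in this message: equal indices mean nothing to do. */
static uint16_t
unrefreshed_proposed(const struct rx_ring_model *r)
{
	if (r->next_to_check == r->next_to_refresh)
		return 0;
	return unrefreshed_as_described(r);
}
```

With both indices at 0 on a 2048-entry ring, the first function reports 2047
stale descriptors, which is in line with the nearly full-ring refresh seen on
2.5.8; with next_to_check == 1 and next_to_refresh == 0 it reports 0. One
caveat on the variant, under the same assumptions: equal indices are also what
the model shows if the cleaner ever gets a full lap ahead of refresh, and the
variant would then report 0 as well.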
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?53D0B4F1.2030408>