Date: Wed, 8 Apr 2009 06:05:08 -0700 (PDT)
From: Barney Cordoba <barney_cordoba@yahoo.com>
To: Ivan Voras <ivoras@freebsd.org>, Robert Watson <rwatson@FreeBSD.org>
Cc: freebsd-net@freebsd.org
Subject: Re: Advice on a multithreaded netisr patch?
Message-ID: <871699.35154.qm@web63906.mail.re1.yahoo.com>
In-Reply-To: <alpine.BSF.2.00.0904061934240.18619@fledge.watson.org>

--- On Mon, 4/6/09, Robert Watson <rwatson@FreeBSD.org> wrote:

> From: Robert Watson <rwatson@FreeBSD.org>
> Subject: Re: Advice on a multithreaded netisr patch?
> To: "Ivan Voras" <ivoras@freebsd.org>
> Cc: freebsd-net@freebsd.org
> Date: Monday, April 6, 2009, 2:52 PM
>
> On Mon, 6 Apr 2009, Ivan Voras wrote:
>
> >> I think we're talking slightly at cross purposes.  There are two
> >> transfers of interest:
> >>
> >> (1) DMA of the packet data to main memory from the NIC
> >> (2) Servicing of CPU cache misses to access data in main memory
> >>
> >> By the time you receive an interrupt, the DMA is complete, so once you
> >
> > OK, this was what was confusing me - for a moment I thought you meant
> > it's not so.
>
> It's a polite lie that we will choose to believe for the purposes of
> simplification.  And probably true for all our drivers in practice
> right now.
>
> >>   m = m_pullup(m, sizeof(*w));
> >>   if (m == NULL)
> >>       return;
> >>   w = mtod(m, struct whatever *);
> >>
> >> m_pullup() here ensures that the first sizeof(*w) bytes of mbuf data
> >> are contiguously stored so that the cast of w to m's data will point
> >> at a
> >
> > So, m_pullup() can resize / realloc() the mbuf? (not that it matters
> > for this purpose)
>
> Yes -- if it can't meet the contiguity requirements using the current
> mbuf chain, it may reallocate and return a new head to the chain (hence
> m being reassigned).  If that reallocation fails, it may return NULL.
> Once you've called m_pullup(), existing pointers into the chain's data
> will be invalid, so if you've already called mtod() on it, you need to
> call it again.
>
> >> - A TCP segment will need to be ACK'd, so if you're sending data in
> >>   chunks in one direction, the ACKs will not be piggy-backed on
> >>   existing data transfers, and instead be sent independently, hitting
> >>   the network stack two more times.
> >
> > No combination of these can make an accounting difference between
> > 1,000 and 250,000 pps.  I must be hitting something very bad here.
>
> Yes, you definitely want to run tcpdump to see what's going on here.
>
> >> - Remember that TCP works to expand its window, and then maintains
> >>   the highest performance it can by bumping up against the top of
> >>   available bandwidth continuously.  This involves detecting buffer
> >>   limits by generating packets that can't be sent, adding to the
> >>   packet count.  With loopback traffic, the drop point occurs when
> >>   you exceed the size of the netisr's queue for IP, so you might try
> >>   bumping that from the default to something much larger.

Robert,

Is there any work being done on lighter-weight locks for queues? It seems
ridiculous to avoid using queues because of lock contention when the locks
are only protecting a couple of lines of code.

Barney
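
To make the question concrete, here is a rough userland sketch of the kind of queue I have in mind: a small bounded ring where the lock covers only a few lines. It is not the netisr code or any existing FreeBSD API. The names (pktq, pktq_enqueue, pktq_dequeue, PKTQ_CAP) are made up for illustration, and the lock is a plain C11 atomic_flag test-and-set spin lock standing in for whatever lighter-weight primitive the kernel might offer.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define PKTQ_CAP 8              /* hypothetical capacity, for illustration */

struct pktq {
    atomic_flag lock;           /* test-and-set spin lock */
    void *slot[PKTQ_CAP];       /* ring of opaque packet pointers */
    size_t head;                /* next slot to dequeue from */
    size_t tail;                /* next slot to enqueue into */
    size_t len;                 /* number of queued entries */
};

static void pktq_init(struct pktq *q)
{
    atomic_flag_clear(&q->lock);
    q->head = q->tail = q->len = 0;
}

static void pktq_lock(struct pktq *q)
{
    /* Spin until the flag was previously clear; acquire ordering
     * makes the previous holder's writes visible to us. */
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;
}

static void pktq_unlock(struct pktq *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Returns false (a "drop") when the queue is full. */
static bool pktq_enqueue(struct pktq *q, void *pkt)
{
    bool ok = false;

    pktq_lock(q);
    if (q->len < PKTQ_CAP) {    /* the critical section is these few lines */
        q->slot[q->tail] = pkt;
        q->tail = (q->tail + 1) % PKTQ_CAP;
        q->len++;
        ok = true;
    }
    pktq_unlock(q);
    return ok;
}

/* Returns NULL when the queue is empty. */
static void *pktq_dequeue(struct pktq *q)
{
    void *pkt = NULL;

    pktq_lock(q);
    if (q->len > 0) {
        pkt = q->slot[q->head];
        q->head = (q->head + 1) % PKTQ_CAP;
        q->len--;
    }
    pktq_unlock(q);
    return pkt;
}
```

Whether spinning like this is actually cheaper than a regular mutex under real contention is exactly what I am asking about; the sketch only shows how short the protected region is.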