Date:      Wed, 20 Jun 2007 18:12:33 -0700
From:      John Polstra <jdp@polstra.com>
To:        Julian Elischer <julian@elischer.org>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: in-kernel tcp server
Message-ID:  <4679D081.7070600@polstra.com>
In-Reply-To: <467999C9.9000402@elischer.org>
References:  <c4630b800706180227x2f1f433dr4ef55e8623062bf1@mail.gmail.com>	 <467787EF.9060009@elischer.org> <46797825.10900@polstra.com>	 <c4630b800706201239jdf09685t1574e78493492029@mail.gmail.com>	 <46799032.5060009@polstra.com> <c4630b800706201350p176ddadcu6b4eb341751d94e7@mail.gmail.com> <467999C9.9000402@elischer.org>

Julian Elischer wrote:

> I would actually like to address the performance issues.
> 
> is there any chance the oldest version (4.x based) might be released,
> or at least it would be nice to get the code snippet that attaches to the
> ng_ksocket and reads and writes the stream..
> 
> I could make a TCP ECHO node that way and use it for tracking down the
> bottlenecks
> I'm not too interested in the actual webserver itself.

I don't have the ksocket version any more.  It was an early experiment 
(in 2001) that I discarded pretty quickly.

The later 4.x-based version that bypassed the TCP stack and socket layer 
performed well on uniprocessor systems.  I didn't feel netgraph was a 
performance problem at all on that version.  But as multiprocessor 
systems became more mainstream, the 4.x version wasn't able to take 
advantage of the added CPUs.  Also, it didn't support ACPI and had 
trouble booting on some of the newer hardware.

For those reasons, I updated to a 7.x-based system.  At that point, the 
newer SMP-friendly netgraph started to impact performance pretty 
seriously.  The allocation/deallocation of netgraph's queue items seemed 
to be a big part of the problem.  In 4.x we just passed mbufs around, 
without any other allocations or deallocations.  In 7.x, the mbufs are 
wrapped up in queue items that have to be allocated and freed, and that 
added a lot of overhead.  I think also that the reader-writer locking in 
netgraph was impacting performance.  It's a really elegant locking 
scheme, but my node graphs were so simple that I didn't really need it.
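
As a rough userland illustration of the overhead described above (not the 
actual kernel code), the sketch below contrasts handing a packet pointer 
straight to a node with wrapping it in a freshly allocated item on every 
delivery.  The names struct pkt, struct item, send_direct and send_wrapped 
are hypothetical stand-ins, not netgraph or mbuf APIs, and it assumes one 
wrapper is allocated and freed per delivery.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf; not the kernel's struct mbuf. */
struct pkt {
    size_t len;               /* payload length in bytes */
    unsigned long deliveries; /* times a node has received this packet */
};

/* Hypothetical stand-in for a queue item wrapping one packet. */
struct item {
    struct pkt *data;
};

/* Direct style: the receiving node gets the packet pointer itself. */
static void recv_direct(struct pkt *p)
{
    p->deliveries++;
}

/* Wrapped style: the receiving node unwraps the item, then frees it. */
static void recv_wrapped(struct item *it)
{
    struct pkt *p = it->data;

    free(it);
    p->deliveries++;
}

/* Deliver p to a node n times with no wrapper.
 * Returns the number of wrapper allocations performed, which is always 0. */
unsigned long send_direct(struct pkt *p, unsigned long n)
{
    for (unsigned long i = 0; i < n; i++)
        recv_direct(p);
    return 0;
}

/* Deliver p to a node n times, allocating one wrapper per delivery.
 * Returns the number of wrapper allocations performed (n unless malloc fails). */
unsigned long send_wrapped(struct pkt *p, unsigned long n)
{
    unsigned long allocs = 0;

    for (unsigned long i = 0; i < n; i++) {
        struct item *it = malloc(sizeof(*it));

        if (it == NULL)
            break;            /* out of memory: stop delivering */
        allocs++;
        it->data = p;
        recv_wrapped(it);
    }
    return allocs;
}
```

Calling send_direct(&p, n) and send_wrapped(&p, n) on the same packet 
delivers it the same number of times, but only the wrapped path pays for 
an allocation and a free on each delivery, which is the kind of per-packet 
cost the paragraph above attributes to the queue items.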

I don't view netgraph as having serious performance problems.  It's just 
that I was aiming for maximal performance (in terms of HTTP sessions per 
second), and was willing to do otherwise unreasonable things to get it.

John



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4679D081.7070600>