Date: Wed, 20 Jun 2007 18:12:33 -0700
From: John Polstra <jdp@polstra.com>
To: Julian Elischer <julian@elischer.org>
Cc: freebsd-hackers@freebsd.org
Subject: Re: in-kernel tcp server
Message-ID: <4679D081.7070600@polstra.com>
In-Reply-To: <467999C9.9000402@elischer.org>
References: <c4630b800706180227x2f1f433dr4ef55e8623062bf1@mail.gmail.com>
 <467787EF.9060009@elischer.org>
 <46797825.10900@polstra.com>
 <c4630b800706201239jdf09685t1574e78493492029@mail.gmail.com>
 <46799032.5060009@polstra.com>
 <c4630b800706201350p176ddadcu6b4eb341751d94e7@mail.gmail.com>
 <467999C9.9000402@elischer.org>

Julian Elischer wrote:
> I would actually like to address the performance issues.
>
> is there any chance the oldest version (4.x based) might be released,
> or at least it would be nice to get the code snippet that attaches to
> the ng_ksocket and reads and writes the stream..
>
> I could make a TCP ECHO node that way and use it for tracking down the
> bottlenecks
> I'm not too interested in the actual webserver itself.

I don't have the ksocket version any more.  It was an early experiment
(in 2001) that I discarded pretty quickly.

The later 4.x-based version that bypassed the TCP stack and socket layer
performed well on uniprocessor systems.  I didn't feel netgraph was a
performance problem at all on that version.  But as multiprocessor
systems became more mainstream, the 4.x version wasn't able to take
advantage of the added CPUs.  Also, it didn't support ACPI and had
trouble booting on some of the newer hardware.

For those reasons, I updated to a 7.x-based system.  At that point, the
newer SMP-friendly netgraph started to impact performance pretty
seriously.  The allocation/deallocation of netgraph's queue items seemed
to be a big part of the problem.  In 4.x we just passed mbufs around,
without any other allocations or deallocations.  In 7.x, the mbufs are
wrapped up in queue items that have to be allocated and freed, and that
added a lot of overhead.  I think also that the reader-writer locking in
netgraph was impacting performance.  It's a really elegant locking
scheme, but my node graphs were so simple that I didn't really need it.

I don't view netgraph as having serious performance problems.  It's just
that I was aiming for maximal performance (in terms of HTTP sessions per
second), and was willing to do otherwise unreasonable things to get it.

John
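
The difference between the two data paths described above can be
illustrated with a small userland sketch.  It is not netgraph code and
not the code from either version of the server: struct buf, struct item,
deliver_direct(), deliver_wrapped() and run() are hypothetical stand-ins,
and malloc()/free() stand in for whatever allocator the kernel actually
uses.  It only shows where the extra per-packet allocation and free come
from when each buffer is wrapped before delivery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a packet buffer; not the kernel's struct mbuf. */
struct buf {
    size_t len;
};

/* Hypothetical stand-in for a per-packet queue item that wraps a buffer. */
struct item {
    struct buf *b;
};

/* Counts how many wrapper items have been allocated so far. */
static unsigned long item_allocs;

/* Direct path: the buffer is handed to the next node as-is. */
static size_t deliver_direct(struct buf *b)
{
    return b->len;
}

/* Wrapped path: allocate an item, deliver through it, then free it. */
static size_t deliver_wrapped(struct buf *b)
{
    struct item *it = malloc(sizeof(*it));
    size_t n;

    assert(it != NULL);
    item_allocs++;
    it->b = b;
    n = it->b->len;
    free(it);
    return n;
}

/* Push `count` packets down one of the two paths; return total bytes. */
static size_t run(size_t count, int wrapped)
{
    struct buf b = { .len = 1460 };   /* assumed payload size */
    size_t total = 0;
    size_t i;

    for (i = 0; i < count; i++)
        total += wrapped ? deliver_wrapped(&b) : deliver_direct(&b);
    return total;
}
```

Calling run(n, 0) leaves item_allocs untouched, while run(n, 1) raises it
by n: one allocation and one free per packet, which is the kind of fixed
per-packet cost that matters when the goal is maximal sessions per
second.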