Date: Fri, 02 Mar 2007 14:32:29 +0100
From: Andre Oppermann <andre@freebsd.org>
To: freebsd-current@freebsd.org, freebsd-net@freebsd.org
Cc: gallatin@freebsd.org, rwatson@freebsd.org, kmacy@freebsd.org
Subject: New optimized soreceive_stream() for TCP sockets, proof of concept
Message-ID: <45E8276D.60105@freebsd.org>

Currently we are using the generic soreceive_generic() function to pull
and copy data from the socket buffer to userland. It is a huge function
that can deal with all eventualities and types of data that may show up
in socket buffers. From a performance point of view, the most important
detail is that it does an unlock-lock cycle for every mbuf data segment
that is copied out. This is necessary to avoid deadlocks. On high-speed
TCP connections it leads to high locking overhead and contention on the
receive socket buffer lock, as the upper and the lower half have to
compete for it. The lower half wants to add newly received data while
the upper half wants to move it to userland and the application.

This patch takes a different approach by adding a specific
soreceive_stream() function that is highly optimized for stream type
sockets such as those TCP uses. On the send side we made this
differentiation, in a different way, a long time ago.

Instead of the unlock-lock dance, soreceive_stream() pulls a properly
sized chunk of data (relative to the receive system call buffer space)
from the socket buffer, drops the lock, and gives copyout as much time
as it needs. In the meantime the lower half can happily add as many new
packets as it wants without having to wait for a lock. It also allows
the upper and lower halves to run on different CPUs without much
interference.

There is an unsolved, nasty race condition in the patch though. When the
socket closes and we still have data around, or the copyout failed, it
tries to put the data back into the socket buffer, which is already gone
by then, leading to a panic. Work is underway to find a reliable fix
for this. I wanted to get this out to the community nonetheless to give
it some more exposure.

The patch is here:

 http://people.freebsd.org/~andre/soreceive_stream-20070302.diff

Any testing, especially on 10Gig cards, and feedback appreciated.

--
Andre
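
For readers who want to see the locking difference described above in
isolation, here is a minimal userland sketch. It is not taken from the
patch. All names in it (struct seg, struct rcvbuf, rb_append(),
recv_per_segment(), recv_bulk()) are hypothetical, a pthread mutex stands
in for the receive socket buffer lock, memcpy() stands in for copyout, and
only whole segments are dequeued to keep it short. Segments are owned by
the caller, so nothing is freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: one data segment in a chain. */
struct seg {
	struct seg *next;
	size_t len;
	char data[64];
};

/* Hypothetical stand-in for a receive socket buffer. */
struct rcvbuf {
	pthread_mutex_t lock;
	struct seg *head, *tail;
	size_t cc;			/* bytes currently queued */
};

static void rb_init(struct rcvbuf *rb)
{
	pthread_mutex_init(&rb->lock, NULL);
	rb->head = rb->tail = NULL;
	rb->cc = 0;
}

static void seg_set(struct seg *s, const char *str)
{
	s->len = strlen(str);
	assert(s->len <= sizeof(s->data));
	memcpy(s->data, str, s->len);
	s->next = NULL;
}

/* "Lower half": append one newly received segment. */
static void rb_append(struct rcvbuf *rb, struct seg *s)
{
	s->next = NULL;
	pthread_mutex_lock(&rb->lock);
	if (rb->tail != NULL)
		rb->tail->next = s;
	else
		rb->head = s;
	rb->tail = s;
	rb->cc += s->len;
	pthread_mutex_unlock(&rb->lock);
}

/* Generic style: one unlock-lock cycle per segment copied out. */
static size_t recv_per_segment(struct rcvbuf *rb, char *dst, size_t resid,
    unsigned *nlocks)
{
	size_t done = 0;

	pthread_mutex_lock(&rb->lock); (*nlocks)++;
	while (rb->head != NULL && rb->head->len <= resid) {
		struct seg *s = rb->head;

		rb->head = s->next;
		if (rb->head == NULL)
			rb->tail = NULL;
		rb->cc -= s->len;
		pthread_mutex_unlock(&rb->lock);
		memcpy(dst + done, s->data, s->len);	/* "copyout" */
		done += s->len;
		resid -= s->len;
		pthread_mutex_lock(&rb->lock); (*nlocks)++;
	}
	pthread_mutex_unlock(&rb->lock);
	return (done);
}

/* Stream style: detach a sized chain under one lock, copy unlocked. */
static size_t recv_bulk(struct rcvbuf *rb, char *dst, size_t resid,
    unsigned *nlocks)
{
	struct seg *chain, *last = NULL, *s;
	size_t take = 0, done = 0;

	pthread_mutex_lock(&rb->lock); (*nlocks)++;
	chain = rb->head;
	for (s = rb->head; s != NULL && take + s->len <= resid; s = s->next) {
		take += s->len;
		last = s;
	}
	if (last == NULL) {
		chain = NULL;		/* nothing fits, detach nothing */
	} else {
		rb->head = last->next;
		if (rb->head == NULL)
			rb->tail = NULL;
		last->next = NULL;	/* cut the chain loose */
		rb->cc -= take;
	}
	pthread_mutex_unlock(&rb->lock);

	/* The lower half may call rb_append() freely while we copy. */
	for (s = chain; s != NULL; s = s->next) {
		memcpy(dst + done, s->data, s->len);
		done += s->len;
	}
	assert(done == take);
	return (done);
}
```

With three queued segments, recv_per_segment() takes the lock four times
while recv_bulk() takes it once. The sketch does not attempt to handle the
put-back race mentioned above; once the chain is detached it is simply
consumed.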