From: Julian Elischer <julian@elischer.org>
Date: Fri, 04 Feb 2005 16:42:45 -0800
To: Guy Helmer
Cc: freebsd-net@freebsd.org
Subject: Re: Netgraph performance question

Guy Helmer wrote:

> A while back, Maxim Konovalov made a commit to usr.sbin/ngctl/main.c
> to increase its socket receive buffer size to help 'ngctl list' deal
> with a large number of nodes, and Ruslan Ermilov responded that
> setting the sysctls net.graph.recvspace=200000 and
> net.graph.maxdgram=200000 was a good idea on a system with a large
> number of nodes.
>
> I'm getting what I consider to be sub-par performance under FreeBSD
> 5.3 from a userland program using ngsockets connected into ng_tee to
> play with packets that are traversing an ng_bridge, and I finally
> have an opportunity to look into this. I say "sub-par" because when
> we tested this configuration using three 2.8GHz Xeon machines with
> Gigabit Ethernet interfaces at 1000Mbps full-duplex, we obtained a
> peak performance for a single TCP stream of about 12MB/sec through
> the bridging machine, as measured by NetPIPE and netperf.

That's not bad if you are pushing everything through userland. That
path is quite expensive, and the scheduling overheads need to be
taken into account too.

> I'm wondering if bumping the recvspace would help, if changing the
> ngsocket hook to queue incoming data would help, if it would be best
> to replace ngsocket with a memory-mapped interface, or if anyone has
> any other ideas that would improve performance.

Netgraph was designed to be a "Lego for link-layer stuff", where
link-layer stuff was taken to mean WAN protocols and the like. In
particular, the userland interface was written with an eye to
prototyping and debugging, and takes no special care to be fast
(though I don't know how you could be much faster while still
crossing into userland). Since then people have broadened its use
considerably, and questions about its performance have become quite
regular. It wasn't designed to be super fast, though it is not bad
considering what it does. There is, however, a push to look at
performance, so it would be interesting to see in more detail what
you are doing. In particular, what are you doing in userland? Might
it make sense to write your own custom netgraph node that does
exactly what you want in the kernel?
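(For a feel of what that involves, here is a minimal sketch of a
custom node type against the FreeBSD 5.x netgraph KPI. The "example"
node name and the echo-back behaviour are made up for illustration,
and the control-message and shutdown methods are omitted for brevity.)

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/mbuf.h>
#include <netgraph/ng_message.h>
#include <netgraph/netgraph.h>

static ng_constructor_t ng_example_constructor;
static ng_newhook_t     ng_example_newhook;
static ng_rcvdata_t     ng_example_rcvdata;

static struct ng_type ng_example_typestruct = {
        .version =      NG_ABI_VERSION,
        .name =         "example",
        .constructor =  ng_example_constructor,
        .newhook =      ng_example_newhook,
        .rcvdata =      ng_example_rcvdata,
};
NETGRAPH_INIT(example, &ng_example_typestruct);

static int
ng_example_constructor(node_p node)
{
        return (0);             /* no per-node state in this sketch */
}

static int
ng_example_newhook(node_p node, hook_p hook, const char *name)
{
        return (0);             /* accept any hook name */
}

/*
 * Called for every packet arriving on a hook.  Per-packet work done
 * here stays in the kernel, avoiding the socket/userland round trip.
 */
static int
ng_example_rcvdata(hook_p hook, item_p item)
{
        struct mbuf *m;
        int error = 0;

        NGI_GET_M(item, m);
        /* ... inspect or rewrite the packet here ... */
        NG_FWD_NEW_DATA(error, item, hook, m);  /* echo out same hook */
        return (error);
}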
> Thanks in advance for any advice,
> Guy Helmer

I have considered a memory-mapped interface that would bolt onto
ng_dev. I did an almost identical interface once before (1986-1992).
There would have to be several commands supported:

  define bufferspace size (ioctl/message)
  mmap buffer space (mmap)
  allocate bufferspace to user (size)   (returns a buffer ID)
  free bufferspace (ID)
  getoffset (ID)                        (returns offset in bufferspace)
  writebuffer (ID, hook, maxmbufsize)   pick up the buffer, put it
                                        into mbufs (maybe as external
                                        pointers), and send it out the
                                        hook in question

Incoming data would be written into buffers (a CPU copy would be
needed) and the ID added to a list of arrived IDs. In addition you
need a way to notify a listening thread/process of arrived IDs. In my
original system the listening process had a socket open with a
particular protocol family and waited for N bytes. When the data
arrived, the socket returned the buffer ID, followed by N-sizeof(ID)
bytes from the header of the packet, so that the app could check the
header and see whether it was interested. Later versions used a
recvmsg() call, with the metadata in a protocol-specific structure
received in parallel with the actual data copied.

Arrived IDs/buffers were 'owned' by N owners, where N was the number
of open listener sockets. Each listener had to respond to the message
by 'freeing' the ID if it wasn't interested. Closing the socket freed
all IDs still owned by it; closing the file did the same. I forget
some of the details.

I guess in this version, instead of sockets, we could use hooks on
the mmap node and use ng sockets to connect to them. The
external-data 'free' method in the mbuf could decrement the ID's
reference count and actually free it when it reached 0 (i.e. when all
parts had been transmitted). The userland process would drop its
reference immediately after issuing the 'send this' command; the
references owned by the mbufs would stop the buffer from being freed
until the packets had been sent. (A rough sketch of this refcounting
follows at the end of this message.)

In our previous version we had a disk/vfs interface too, with a
"write this to file descriptor N" command and a "write this to raw
disk X at offset Y" command; the disk would own a reference until the
data was written, of course. There was also a "read from raw disk X
at offset Y into buffer ID" command; you had to own the buffer
already for it to work. In 1987 we were saturating several Ethernets
off disk with this at 5% CPU load :-)

  disk->[dma]->mem->[dma]->ethernet

Since machines are now hundreds of times faster (a 30MHz 68010 with a
32-bit memory bus vs. a 3GHz machine with a 64-bit bus), some of this
doesn't make sense any more, but it was an achievement at the time.

Just an idea.
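(To make that refcounting concrete, here is a rough sketch using
FreeBSD's external mbuf storage as it looked in the 5.x era. All the
ngmm_* names are invented, ngmm_release() is a hypothetical helper
that would return the ID to the free list, and locking is elided.)

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/mbuf.h>

struct ngmm_buf {
        u_int32_t       id;     /* buffer ID handed to userland */
        u_int           refs;   /* userland owner + one per mbuf in flight */
        caddr_t         data;   /* address within the mmap()ed region */
        size_t          len;
};

static void
ngmm_release(struct ngmm_buf *b)
{
        /* hypothetical: return the ID to the pool's free list */
}

/*
 * External-storage free routine, run when an mbuf referencing the
 * buffer is freed.  The last reference returns the ID to the free
 * list (locking elided for brevity).
 */
static void
ngmm_extfree(void *buf, void *args)
{
        struct ngmm_buf *b = args;

        if (--b->refs == 0)
                ngmm_release(b);
}

/*
 * Wrap a shared buffer in an mbuf without copying the payload, taking
 * one reference that ngmm_extfree() drops once the packet has been
 * transmitted and the mbuf freed.
 */
static struct mbuf *
ngmm_tombuf(struct ngmm_buf *b)
{
        struct mbuf *m;

        MGETHDR(m, M_DONTWAIT, MT_DATA);
        if (m == NULL)
                return (NULL);
        b->refs++;
        MEXTADD(m, b->data, b->len, ngmm_extfree, b, 0, EXT_MOD_TYPE);
        m->m_len = m->m_pkthdr.len = b->len;
        return (m);
}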