From owner-freebsd-current@FreeBSD.ORG Fri Sep 29 17:30:26 2006 Return-Path: X-Original-To: freebsd-current@freebsd.org Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8C6F116A5C1; Fri, 29 Sep 2006 17:30:26 +0000 (UTC) (envelope-from rrs@cisco.com) Received: from sj-iport-6.cisco.com (sj-iport-6.cisco.com [171.71.176.117]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3D47F43D6B; Fri, 29 Sep 2006 17:30:18 +0000 (GMT) (envelope-from rrs@cisco.com) Received: from sj-dkim-4.cisco.com ([171.71.179.196]) by sj-iport-6.cisco.com with ESMTP; 29 Sep 2006 10:30:18 -0700 Received: from sj-core-1.cisco.com (sj-core-1.cisco.com [171.71.177.237]) by sj-dkim-4.cisco.com (8.12.11.20060308/8.12.11) with ESMTP id k8THUIYr000491; Fri, 29 Sep 2006 10:30:18 -0700 Received: from xbh-sjc-231.amer.cisco.com (xbh-sjc-231.cisco.com [128.107.191.100]) by sj-core-1.cisco.com (8.12.10/8.12.6) with ESMTP id k8THUHQX018464; Fri, 29 Sep 2006 10:30:17 -0700 (PDT) Received: from xfe-sjc-211.amer.cisco.com ([171.70.151.174]) by xbh-sjc-231.amer.cisco.com with Microsoft SMTPSVC(6.0.3790.211); Fri, 29 Sep 2006 10:30:17 -0700 Received: from [127.0.0.1] ([171.68.225.134]) by xfe-sjc-211.amer.cisco.com with Microsoft SMTPSVC(6.0.3790.1830); Fri, 29 Sep 2006 10:30:16 -0700 Message-ID: <451D5801.8030504@cisco.com> Date: Fri, 29 Sep 2006 13:29:37 -0400 From: Randall Stewart User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050920 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Peter Lei , Michael Tuexen Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 29 Sep 2006 17:30:16.0676 (UTC) FILETIME=[EDB92A40:01C6E3EC] DKIM-Signature: a=rsa-sha1; q=dns; l=5220; t=1159551018; x=1160415018; c=relaxed/simple; s=sjdkim4002; h=Content-Type:From:Subject:Content-Transfer-Encoding:MIME-Version; d=cisco.com; i=rrs@cisco.com; z=From:Randall=20Stewart=20 |Subject:Some=20interesting=20plots; X=v=3Dcisco.com=3B=20h=3DiSgebo/BaR1EeNZmJoamL6mtx20=3D; b=fMT0u4OYvChim9cv9tl+b0sRZ34FOdguB5Vty9fkWzJylBkOW+pz9Or1mxpWXA1ofy5INPEr YB2Qs9SwvI59p3cqpcoFzW6VfdY8DDff78MgRWD+Y0StGNTrjLMzLqZD; Authentication-Results: sj-dkim-4.cisco.com; header.From=rrs@cisco.com; dkim=pass ( sig from cisco.com verified; ); Cc: "George V. Neville-Neil" , freebsd-current@freebsd.org, Robert Watson Subject: Some interesting plots X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 29 Sep 2006 17:30:26 -0000 All: As you all may know I have been working on getting SCTP into Current.. I have been, of late, trying to tweak things to get the BEST performance out of the implementation... I have moved my testing off between two 7.0 machines.. same code base on each updated Sep 25. One is a 2.8Gig Dell SC1600 Xeon.. (hyper threaded CPU). The other is a P4D (2.8Gig .. slightly faster, true dual processor machine). They are connected by two intel EM server cards like so: +----+ +----+ 1 | em1 <---------------------> em0 | 2 | em0 <-----locallan--------> msk0| | dc0 <-Direct Inet | +----+ +-----+ em1 has 10.1.2.12 em0 10.1.2.21 em0 has 10.1.1.12 msk0 10.1.1.21 Now I setup the association with a server on 1, the conversation goes something like this: gettimeofday() <---------------Send me N bytes in X byte records---- ------>send----> ------>send----> when N bytes are recieved gettimeofday()/close() ... close() calculate and announce b/w Now the SCTP implemenation has some interesting logging things that allow me to turn on various compile options that log into a 80,000 entry memory buffer that you can pull with specific commands. I have rigged the sender to log each sack in and tell on the sender side (1): a) How much data is queued to the peer b) Total flight size c) rwnd advertised d) the calulated rwnd. On the receiver side (2) on both exit and entry to the receiving function in the kernel: On entry: a) read request size (uio_resid) b) amount in queue on the sb to read c) if blocking is allowed d) How much has to be read total to get a window update sack out On exit: a) how much has been read since the last sack sent b) how much is left un-read c) the rwnd we are advertising d) how much is still on the sb. ------------------------------------------------------------ The logging is nicely light weight (as long as you don't roll the log). This is because I use some atomic functions to move forward so I have picked a size that won't roll it... I have then played a bit with GNU plot to generate some graphs of all this... fun fun ;-D Now I have placed the plots at: http://www.sctp.org/plots.tar.bz2 (they are about 2Meg.. and I have a 512k up link .. so patience is a must). So what will you see.. plots that say *plotall.ps give you the entire transfer. ones that say plot*.ps show everything: a) When things are read in red y axis is size unread b) what the sender has in flight in green c) what the a-rwnd is in blue d) send-q size in pink. There is always a plot for each 100ms so you can "zoom" in on things... Especially 1.4-1.5 seconds look interesting.. Now plots that have cntplot* Show a) flight b) arwnd c) outq and d) number of sacks every 400us (multiplied by 1000 so if you had 10 sacks in a 400us period you would show it at 40000). This too is interesting. Finally plots that have lcntplot* show only the flight size vs the number of sack's every 500us (multiplied by 1000)... The interesting thing that I have going on is: 1) The sender is running dry.. you can see this in the graphs when the flight crashes down to almost nothing at say 1.5 seconds or so... also seen whenever the blue line moves down to the green flight.. 2) Any time the send queue (blue line) is at the flight (green) this is bad since it means we have the danger of running dry. 3) The reader is completely keeping up... since it reads constantly and never has much of a sb_cc (the red line stays on the bottom of the page). Now whats strange about this is that I am only getting 760Megbits per second (or so).. and in that 760Mb I see processor <1> about running 60-70% idle.. and processor <2> running 70-80% idle. Now I am most puzzeled on how to make this run faster... a) what causes the spikes (ideas would be welcome) b) why can't I get more CPU.. I have seperate locks that the receiver and the sender use to make it so there is minimal contention between the network side of things and the queuing and or reading... In fact I just added the sender seperate lock in the last day or so.. This is because I was seeing not 2 or so but 8 dips before I did that in the "queue" One other note, I see TCP is only getting 250Meg or so on the same test (It can run either).. now it used to get close to the full pipe (Gigbit).. so is there some issue with the new code that was recently submitted? Have a look.. its interesting.. I am rather clueless though on how to grab that CPU that top tells me is there (if its accurate :-D) suggestions on where to look or tuning I could do would be welcome. Thanks R -- Randall Stewart NSSTG - Cisco Systems Inc. 803-345-0369 815-342-5222 (cell)