From: Borje Josefsson
To: Terry Lambert
Cc: freebsd-hackers@freebsd.org, freebsd-performance@freebsd.org,
    "Jin Guojun [DSD]", Eric Anderson, David Gilbert,
    Anders Ragge Magnusson
Subject: Re: tcp_output starving -- is due to mbuf get delay?
Date: Sat, 12 Apr 2003 12:37:25 +0200
Message-Id: <20030412103725.8A6963B3@porter.dc.luth.se>
In-Reply-To: <3E973CBF.FB552960@mindspring.com>

On Fri, 11 Apr 2003 15:07:59 PDT Terry Lambert wrote:

> Borje Josefsson wrote:
> > I did a quick test with some combination of the OIDs you sent, except I
> > didn't reboot between each test:
>
> The reboots were intended to keep the statistics counters relatively
> accurate between the FreeBSD and NetBSD sender runs.
> By doing that, you can tell if what's happening on the receiver is
> the same for both sender machines. If you don't reboot, then the
> statistics are polluted with other traffic, and can't be compared.

OK. I found the -z flag to netstat, which clears the counters.
Unfortunately NetBSD lacks this flag, so I rebooted that host several
times :-( Just to be sure I didn't have anything "old" lying around, I
rebooted the FreeBSD host before I started.

> You should also start clean on each sender, and get the same stats
> on the sender. I would add "vmstat -i", to look at interrupt overhead.
>
> Note that FreeBSD jumbograms are in external mbufs allocated to the
> cards on receive. On transmit, they are scatter/gathered. NetBSD
> might not have this overhead. The copy overhead there could account
> for a lot of CPU time.

I think NetBSD-current claims to do zero-copy transfers. I added Anders
Magnusson to the CC: of this mail; he knows a great deal about NetBSD
networking internals and can surely fill in more details on this.

> > Netstat -m (tcp and ip portion) when I started and after the trials:
>
> Side-by-side/interleaved is more useful. I will do it manually for
> the ones that change; if we continue this discussion, you get to do
> the work in the future (b=before, A=after)
>
> b> 0 resends initiated by MTU discovery
> b> 6446084 ack-only packets (199 delayed)
> A> 6446155 ack-only packets (207 delayed)
>         71                    8
>
> This is odd. You must be sending data in both directions. Thus
> the lower bandwidth could be the result of negotiated options; you
> may want to try turning _on_ rfc1644.

Did that. No difference in performance.

> The delayed ACKs are bad. Can you either set "PUSH" on the socket,
> or turn off delayed ACK entirely?

Did that (tcp.delayed_ack=0). No apparent difference.

> All in all, there's not a lot of weird stuff going on; now you need
> to look at the NetBSD vs.
> the FreeBSD transmitters, in a similar way, get the deltas for
> both, and then compare them to each other.
>
> A really important thing to look at is the "vmstat -i" I asked for
> earlier, in order to get interrupt counts on the transmitter. Most
> likely, there is a driver difference causing the "problem"; you
> should be able to see this in a differential for the transmit
> interrupt overhead being higher on the FreeBSD box.
>
> It would also be very interesting to compare the netstat numbers
> between the transmitters, as suggested above; the numbers should
> tell you about differences in implementation on the driver side.

OK, here goes, as a first attempt to match sender and receiver data.

Apologies for the long lines - I have tried to "match" appropriate
sender and receiver lines below. *Note* that there are no "before" and
"after" in the netstat figures; these are net values accumulated during
the test. In some cases there might be some odd packets that don't have
anything to do with my ttcp test (since I access the hosts remotely),
but I ran everything from a shell script to file, so the difference
should be minor.

I'll await comments on the data below before doing anything more.

--Börje

sender=FreeBSD                               receiver=NetBSD

***tcp:
305178 packets sent                          305179 received
305175 data packets (1249996800 bytes)       305176 packets (1249996800 bytes) in seq.
0 data packets (0 bytes) retransmitted
0 resends initiated by MTU discovery
1 ack-only packet (0 delayed)
0 URG only packets
0 window probe packets
0 window update packets                      0 window update packets received
2 control packets
205911 packets received                      206052 sent
168215 acks (for 1249148976 bytes)           136850 ack-only packets (168328 delayed) sent
0 duplicate acks
0 acks for unsent data
0 packets (0 bytes) received in-seq
0 completely duplicate packets
0 old duplicate packets
0 packets with some dup. data
0 out-of-order packets (0 bytes)
0 packets of data after window
0 window probes
37696 window update packets                  69201 window update packets sent
0 packets received after close
0 discarded for bad checksums
0 discarded for bad header offset f.
0 discarded because packet too short
168215 segments updated rtt (of 59609)       1 segments updated rtt (of 1 attempts)
9795 correct ACK header predictions          1 correct ACK header predictions
0 correct data packet header predict.        305175 correct data packet header predict.

***ip:
205915 total packets received                206052 packets sent from this host
305179 packets sent from this host           305185 packets for this host

vmstat -i on *sender*
                     ===before===        ===after===
interrupt            total   rate        total   rate
ata0 irq14               4      0            4      0
bge1 irq7               48      0           48      0
mux irq11           372597    325       459967    396
mux irq10               15      0           15      0
fdc0 irq6                2      0            2      0
atkbd0 irq1              1      0            1      0
clk irq0            114364     99       115893     99
rtc irq8            146388    127       148346    127
Total               633419    553       724276    624

vmstat -i on *receiver*
                     ==before==          ==after==
interrupt            total   rate        total   rate
cpu0 softclock       16738     99        18687     99
cpu0 softnet           170      1        89848    480
cpu0 softserial          1      0            1      0
pic0 pin 11            264      1        90106    481
pic0 pin 14           1528      9         1564      8
pic0 pin 3               1      0            1      0
pic0 pin 0           16910    100        18831    100
Total                35612    211       219038   1171

======================================================================

sender=NetBSD                                receiver=FreeBSD

*** tcp:
282935 packets sent                          282936 packets received
282933 data packets (1249996800 bytes)       282933 packets (1249996800 bytes) received in-sequ
0 data packets (0 bytes) retransmitted
1 ack-only packets (32 delayed)              2 acks (for 49 bytes) received
0 window probe packets
0 window update packet
1 control packet
0 send attempts resulted in self-quench
187507 packets received                      187947 packets sent
187131 acks (for 1247077744 bytes)           95364 ack-only packets (0 delayed) sent
0 duplicate acks
0 acks for unsent data
0 packets (0 bytes) received in-sequence
0 completely duplicate packets (0 bytes)
0 old duplicate packets
0 packets with some dup. data
0 out-of-order packets (0 bytes)
0 packets (0 bytes) of data after window
0 window probes
374 window update packets                    92582 window update packets sent
0 packets received after close
0 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
1 connection request
0 connection accept
1 connections established (incl. accepts)    1 connection established (including accepts)
0 connection closed (including 0 drops)
0 embryonic connections dropped
182455 segments updated rtt (of 78677)       2 segments updated rtt (of 1 attempt)
0 retransmit timeouts
0 connections dropped by rexmit timeout
0 persist timeouts
0 keepalive timeouts
0 keepalive probes sent
0 connections dropped by keepalive
14 correct ACK header predictions            1 correct ACK header prediction
0 correct data packet header pred.           282931 correct data packet header predictions
0 PCB hash misses
0 dropped due to no socket
0 connections drained due to memory shortage
0 PMTUD blackholes detected
0 bad connection attempts
0 SYN cache entries added
0 hash collisions
0 completed
0 aborted (no space to build PCB)
0 timed out
0 dropped due to overflow
0 dropped due to bucket overflow
0 dropped due to RST
0 dropped due to ICMP unreachable
0 SYN,ACKs retransmitted
0 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)

*** ip:
187503 total packets received
0 bad header checksums                       0 bad header checksums
0 with size smaller than minimum             0 with size smaller than minimum
0 with data size < data length               0 with data size < data length
0 with length > max ip packet size           0 with ip length > max ip packet size
0 with header length < data size             0 with header length < data size
0 with data length < header length           0 with data length < header length
0 with bad options                           0 with bad options
0 with incorrect version number              0 with incorrect version number
0 fragments received                         0 fragments received
0 fragments dropped (dup or out of space)    0 fragments dropped (dup or out of space)
0 malformed fragments dropped
0 fragments dropped after timeout            0 fragments dropped after timeout
0 packets reassembled ok                     0 packets reassembled ok
187503 packets for this host                 187947 packets sent from this host
0 packets for unknown/unsupported protocol
0 packets forwarded
0 packets not forwardable
0 redirects sent
282936 packets sent from this host           282936 total packets received
0 packets sent with fabricated ip header
0 output packets dropped due to no bufs, etc.
0 output packets discarded due to no route
0 output datagrams fragmented                0 output datagrams fragmented
0 fragments created                          0 fragments created
0 datagrams that can't be fragmented         0 datagrams that can't be fragmented
0 datagrams with bad address in header       0 datagrams with bad address in header

vmstat -i on *sender*
                     ===before===        ===after===
interrupt            total   rate        total   rate
cpu0 softclock        4737     98         5777     99
cpu0 softnet            79      1        41426    714
cpu0 softserial          1      0            1      0
pic0 pin 11            146      3        41777    720
pic0 pin 14           1516     31         1537     26
pic0 pin 3               1      0            1      0
pic0 pin 0            4905    102         5928    102
Total                11385    237        96447   1662

vmstat -i on *receiver*
                     ===before===        ===after===
interrupt            total   rate        total   rate
ata0 irq14               4      0            4      0
bge1 irq7               48      0           48      0
mux irq11           744037    564      1027879    771
mux irq10               15      0           15      0
fdc0 irq6                2      0            2      0
atkbd0 irq1              1      0            1      0
clk irq0            131831    100       133175     99
rtc irq8            168746    128       170467    127
Total              1044684    792      1331591    999
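
One way to read the "vmstat -i" figures for the two sender runs is to take the before/after difference on the network interrupt line and relate it to the TCP packet counts. The Python sketch below does only that arithmetic. It rests on assumptions that are not stated in the mail: that "mux irq11" (FreeBSD) and "pic0 pin 11" (NetBSD) are the interrupt lines of the interface under test, and that the TCP "packets sent" and "packets received" counters are a fair stand-in for the frames that interface handled. The function and dictionary names are made up for illustration.

```python
# Counters copied from the netstat and "vmstat -i" tables in this mail.
# ASSUMPTION: "mux irq11" (FreeBSD) and "pic0 pin 11" (NetBSD) are the
# interrupt lines of the NIC under test; the mail does not say so.
RUNS = {
    "FreeBSD sender": {"irq_before": 372597, "irq_after": 459967,
                       "pkts_sent": 305178, "pkts_received": 205911},
    "NetBSD sender": {"irq_before": 146, "irq_after": 41777,
                      "pkts_sent": 282935, "pkts_received": 187507},
}


def interrupt_delta(before, after):
    """Interrupts taken during the test (after minus before)."""
    if after < before:
        raise ValueError("counter went backwards; was the host rebooted?")
    return after - before


def packets_per_interrupt(run):
    """Packets sent plus received, divided by interrupts taken."""
    delta = interrupt_delta(run["irq_before"], run["irq_after"])
    if delta == 0:
        raise ValueError("no interrupts recorded during the test")
    return (run["pkts_sent"] + run["pkts_received"]) / delta


def summarize(runs):
    """Map each run name to its packets-per-interrupt ratio."""
    return {name: packets_per_interrupt(run) for name, run in runs.items()}


if __name__ == "__main__":
    for name, ratio in summarize(RUNS).items():
        print(f"{name}: {ratio:.2f} packets per interrupt")
```

With the figures above this gives roughly 5.8 packets per interrupt for the FreeBSD sender and roughly 11.3 for the NetBSD sender. It is only a rough indicator: the FreeBSD line is a shared ("mux") interrupt, and the counters may include a few packets from the remote login sessions.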