Date: Thu, 8 May 2008 19:26:47 +0100 From: Tim Gebbett <tim@gebbettco.com> To: Andre Oppermann <andre@freebsd.org> Cc: Tim@gebbettco.com, Deng XueFeng <dengxf@gmail.com>, Mark Hills <mark@pogo.org.uk>, freebsd-net@freebsd.org Subject: Re: read() returns ETIMEDOUT on steady TCP connection Message-ID: <32570632be785826455c493ec4d37206@193.189.140.95> In-Reply-To: <482324F9.7030802@freebsd.org> References: <482324F9.7030802@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi Andre, Applied the patch, I could not see anything different to the last test. No packet filtering or NAT are enabled, the test is running over a switch. Many thanks - Tim 258/6657/6915 mbufs in use (current/cache/total) 256/1084/1340/25600 mbuf clusters in use (current/cache/total/max) 256/1024 mbuf+clusters out of packet secondary zone in use (current/cache) 0/3565/3565/51200 4k (page size) jumbo clusters in use (current/cache/total/max) 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) 576K/18092K/18668K bytes allocated to network (current/cache/total) 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) 0/0/0 requests for jumbo clusters denied (4k/9k/16k) 0/5/6656 sfbufs in use (current/peak/max) 0 requests for sfbufs denied 0 requests for sfbufs delayed 0 requests for I/O initiated by sendfile 0 calls to protocol drain routines tcp: 2535986 packets sent 2331504 data packets (3266105457 bytes) 258369 data packets (371873800 bytes) retransmitted 46685 data packets unnecessarily retransmitted 0 resends initiated by MTU discovery 27972 ack-only packets (3905 delayed) 0 URG only packets 0 window probe packets 10636 window update packets 4 control packets 1888423 packets received 1172093 acks (for 3256670917 bytes) 425512 duplicate acks 0 acks for unsent data 46363 packets (52992771 bytes) received in-sequence 19 completely duplicate packets (17508 bytes) 0 old duplicate packets 0 packets with some dup. data (0 bytes duped) 293 out-of-order packets (376674 bytes) 0 packets (0 bytes) of data after window 0 window probes 242695 window update packets 0 packets received after close 0 discarded for bad checksums 0 discarded for bad header offset fields 0 discarded because packet too short 0 discarded due to memory problems 2 connection requests 2054 connection accepts 0 bad connection attempts 0 listen queue overflows 597 ignored RSTs in the windows 2056 connections established (including accepts) 2052 connections closed (including 2049 drops) 2048 connections updated cached RTT on close 2048 connections updated cached RTT variance on close 2048 connections updated cached ssthresh on close 0 embryonic connections dropped 1172093 segments updated rtt (of 1057825 attempts) 72399 retransmit timeouts 1 connection dropped by rexmit timeout 0 persist timeouts 0 connections dropped by persist timeout 0 Connections (fin_wait_2) dropped because of timeout 0 keepalive timeouts 0 keepalive probes sent 0 connections dropped by keepalive 4806 correct ACK header predictions 43111 correct data packet header predictions 2054 syncache entries added 0 retransmitted 0 dupsyn 0 dropped 2054 completed 0 bucket overflow 0 cache overflow 0 reset 0 stale 0 aborted 0 badack 0 unreach 0 zone failures 2054 cookies sent 0 cookies received 83801 SACK recovery episodes 116632 segment rexmits in SACK recovery episodes 168710294 byte rexmits in SACK recovery episodes 544885 SACK options (SACK blocks) received 14 SACK options (SACK blocks) sent 0 SACK scoreboard overflow udp: 31 datagrams received 0 with incomplete header 0 with bad data length field 0 with bad checksum 0 with no checksum 0 dropped due to no socket 10 broadcast/multicast datagrams undelivered 0 dropped due to full socket buffers 0 not for hashed pcb 21 delivered 22 datagrams output 0 times multicast source filter matched sctp: 0 input packets 0 datagrams 0 packets that had data 0 input SACK chunks 0 input DATA chunks 0 duplicate DATA chunks 0 input HB chunks 0 HB-ACK chunks 0 input ECNE chunks 0 input AUTH chunks 0 chunks missing AUTH 0 invalid HMAC ids received 0 invalid secret ids received 0 auth failed 0 fast path receives all one chunk 0 fast path multi-part data 0 output packets 0 output SACKs 0 output DATA chunks 0 retransmitted DATA chunks 0 fast retransmitted DATA chunks 0 FR's that happened more than once to same chunk. 0 intput HB chunks 0 output ECNE chunks 0 output AUTH chunks 0 ip_output error counter Packet drop statistics: 0 from middle box 0 from end host 0 with data 0 non-data, non-endhost 0 non-endhost, bandwidth rep only 0 not enough for chunk header 0 not enough data to confirm 0 where process_chunk_drop said break 0 failed to find TSN 0 attempt reverse TSN lookup 0 e-host confirms zero-rwnd 0 midbox confirms no space 0 data did not match TSN 0 TSN's marked for Fast Retran Timeouts: 0 iterator timers fired 0 T3 data time outs 0 window probe (T3) timers fired 0 INIT timers fired 0 sack timers fired 0 shutdown timers fired 0 heartbeat timers fired 0 a cookie timeout fired 0 an endpoint changed its cookiesecret 0 PMTU timers fired 0 shutdown ack timers fired 0 shutdown guard timers fired 0 stream reset timers fired 0 early FR timers fired 0 an asconf timer fired 0 auto close timer fired 0 asoc free timers expired 0 inp free timers expired 0 packet shorter than header 0 checksum error 0 no endpoint for port 0 bad v-tag 0 bad SID 0 no memory 0 number of multiple FR in a RTT window 0 RFC813 allowed sending 0 RFC813 does not allow sending 0 max burst dosn't allow sending 0 look ahead tells us no memory in interface 0 numbers of window probes sent 0 times an output error to clamp down on next user send. 0 times sctp_senderrors were caused from a user 0 number of in data drops due to chunk limit reached 0 number of in data drops due to rwnd limit reached 0 times a ECN reduced the cwnd 0 used express lookup via vtag 0 collision in express lookup. 0 times the sender ran dry of user data on primary 0 same for above 0 sacks the slow way 0 window update only sacks sent 0 sends with sinfo_flags !=0 0 unordered sends 0 sends with EOF flag set 0 sends with ABORT flag set 0 times protocol drain called 0 times we did a protocol drain 0 times recv was called with peek 0 cached chunks used 0 cached stream oq's used 0 unread messages abandonded by close 0 send burst avoidance, already max burst inflight to net 0 send cwnd full avoidance, already max burst inflight to net 0 number of map array over-runs via fwd-tsn's ip: 1888457 total packets received 0 bad header checksums 0 with size smaller than minimum 0 with data size < data length 0 with ip length > max ip packet size 0 with header length < data size 0 with data length < header length 0 with bad options 0 with incorrect version number 0 fragments received 0 fragments dropped (dup or out of space) 0 fragments dropped after timeout 0 packets reassembled ok 1888454 packets for this host 0 packets for unknown/unsupported protocol 0 packets forwarded (0 packets fast forwarded) 3 packets not forwardable 0 packets received for unknown multicast group 0 redirects sent 2628572 packets sent from this host 0 packets sent with fabricated ip header 0 output packets dropped due to no bufs, etc. 0 output packets discarded due to no route 0 output datagrams fragmented 0 fragments created 0 datagrams that can't be fragmented 0 tunneling packets that can't find gif 0 datagrams with bad address in header On Thu, 08 May 2008 18:06:17 +0200, Andre Oppermann <andre@freebsd.org> wrote: > Hi Tim, > > looking at the ip_output() path there are some places that can > return ENOBUFS: > > a) interface queue length check > > b) packet filter > > c) destination address rewrite through NAT > > d) if_output() call > > e) IP fragmentation if DF was not set > > The first one of those is the most likely to be the source of the > error. The output interface queue length in read unlocked and may > be a stale value on an SMP machine. Further down in ether_output() > there are some further possibilities for ENOBUFS errors. But lets > concentrate on a) first. > > For testing purposes please apply the following patch to ip_output(): > > ------------------------------------------------------------------- > cvs diff -up ip_output.c > Index: ip_output.c > =================================================================== > RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v > retrieving revision 1.276.2.1 > diff -u -p -r1.276.2.1 ip_output.c > --- ip_output.c 9 Mar 2008 21:04:54 -0000 1.276.2.1 > +++ ip_output.c 8 May 2008 16:02:32 -0000 > @@ -370,7 +370,7 @@ again: > ip->ip_src = IA_SIN(ia)->sin_addr; > } > } > - > +#if 0 > /* > * Verify that we have any chance at all of being able to queue > the > * packet or packet fragments, unless ALTQ is enabled on the > given > @@ -390,7 +390,7 @@ again: > ifp->if_snd.ifq_drops += (ip->ip_len / ifp->if_mtu + 1); > goto bad; > } > - > +#endif > /* > * Look for broadcast address and > * verify user is allowed to send > ------------------------------------------------------------------- > > If there is a real interface output queue full event the IFQ_HANDOFF() > and IFQ_ENQUEUE() macros will report it too. Then we can focus on the > interface queues. > > -- > Andre > > Tim Gebbett wrote: >> Hi all, >> >> applied the patch, >> >> Well before a ETIMEDOUT error occurred (around 60secs), the tcp debug > started venting massive quantities of tcp_output error 55 while sending > with syncache noise: >> >> >> y 8 12:14:26 timtest kernel: :63859 to [192.168.5.40]:80; tcp_output: > error 55 whilTeC Ps:e n[d1i9n2g. 1(6i8p._5o.u4t3p]u:t64 0371 )t >> May 8 12:14:26 timtest kernel: o >> May 8 12:14:26 timtest kernel: [192.168.5.40]:80; tcp_output: error 55 > while sendingT >> May 8 12:14:26 timtest kernel: C >> May 8 12:14:26 timtest kernel: P: [192.168.5.43]:63859 to > [192.168.5.40]:80; tcp_output: error 55 while sending >> May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to > [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1) >> May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to > [192.168.5.40]:80; tcp_output: error 55 while sending >> May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:63857 to > [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1) >> May 8 12:14:26 timtest kernel: TCP: [192.168.5.42]:56421 toT C[P1:9 > 2[.119628..51.6480.]5:.8403;] :6t3c8p57_ otuot p[u1t9:2 .e1r68r.o5r. > 40]:8505; whticlpe_ osuetnpduitn:g e(rirpo_ro utpu5t5 w1h)i >> >> interspersed with clean blocks of 20 entries or so of: >> >> May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to > [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1) >> May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to > [192.168.5.40]:80; tcp_output: error 55 while sending >> May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:63857 to > [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1) >> >> >> The output did not look appreciably different when the ETIMEDOUT > occurred. >> >> On stopping the client test program: >> >> May 8 12:14:46 timtest kernel: TCP: [192.168.5.43]:63978 to > [192.168.5.40]:80 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST without > matching syncache entry (possibly syncookie only), segment ignored >> May 8 12:14:46 timtest kernel: TCP: [192.168.5.43]:63978 to > [192.168.5.40]:80 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST without > matching syncache entry (possibly syncookie only), segment ignored >> May 8 12:14:46 timtest kernel: TCP: [192.168.5.43]:63978 to > [192.168.5.40]:80 tcpflags 0x4<RST>; syncache_chkrst: Spurious RST without > matching syncache entry (possibly syncookie only), segment ignored >> >> netstat -m >> >> 258/11007/11265 mbufs in use (current/cache/total) >> 256/1596/1852/25600 mbuf clusters in use (current/cache/total/max) >> 256/1536 mbuf+clusters out of packet secondary zone in use > (current/cache) >> 0/7585/7585/51200 4k (page size) jumbo clusters in use > (current/cache/total/max) >> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) >> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) >> 576K/36283K/36860K bytes allocated to network (current/cache/total) >> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) >> 0/0/0 requests for jumbo clusters denied (4k/9k/16k) >> 0/4/6656 sfbufs in use (current/peak/max) >> 0 requests for sfbufs denied >> 0 requests for sfbufs delayed >> 0 requests for I/O initiated by sendfile >> 0 calls to protocol drain routines >> >> Thanks again for your help - Tim >> >> >> >> >> >> >> >> >> > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?32570632be785826455c493ec4d37206>