From owner-freebsd-net@FreeBSD.ORG Thu May 8 16:06:15 2008 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1A605106566C for ; Thu, 8 May 2008 16:06:15 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 8DFA58FC41 for ; Thu, 8 May 2008 16:06:14 +0000 (UTC) (envelope-from andre@freebsd.org) Received: (qmail 76578 invoked from network); 8 May 2008 15:07:51 -0000 Received: from localhost (HELO [127.0.0.1]) ([127.0.0.1]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 8 May 2008 15:07:51 -0000 Message-ID: <482324F9.7030802@freebsd.org> Date: Thu, 08 May 2008 18:06:17 +0200 From: Andre Oppermann User-Agent: Thunderbird 1.5.0.14 (Windows/20071210) MIME-Version: 1.0 To: Tim@gebbettco.com References: <4822BABB.4020407@freebsd.org> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org, Deng XueFeng , Mark Hills Subject: Re: read() returns ETIMEDOUT on steady TCP connection X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 08 May 2008 16:06:15 -0000 Hi Tim, looking at the ip_output() path there are some places that can return ENOBUFS: a) interface queue length check b) packet filter c) destination address rewrite through NAT d) if_output() call e) IP fragmentation if DF was not set The first one of those is the most likely to be the source of the error. The output interface queue length in read unlocked and may be a stale value on an SMP machine. Further down in ether_output() there are some further possibilities for ENOBUFS errors. But lets concentrate on a) first. For testing purposes please apply the following patch to ip_output(): ------------------------------------------------------------------- cvs diff -up ip_output.c Index: ip_output.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/ip_output.c,v retrieving revision 1.276.2.1 diff -u -p -r1.276.2.1 ip_output.c --- ip_output.c 9 Mar 2008 21:04:54 -0000 1.276.2.1 +++ ip_output.c 8 May 2008 16:02:32 -0000 @@ -370,7 +370,7 @@ again: ip->ip_src = IA_SIN(ia)->sin_addr; } } - +#if 0 /* * Verify that we have any chance at all of being able to queue the * packet or packet fragments, unless ALTQ is enabled on the given @@ -390,7 +390,7 @@ again: ifp->if_snd.ifq_drops += (ip->ip_len / ifp->if_mtu + 1); goto bad; } - +#endif /* * Look for broadcast address and * verify user is allowed to send ------------------------------------------------------------------- If there is a real interface output queue full event the IFQ_HANDOFF() and IFQ_ENQUEUE() macros will report it too. Then we can focus on the interface queues. -- Andre Tim Gebbett wrote: > Hi all, > > applied the patch, > > Well before a ETIMEDOUT error occurred (around 60secs), the tcp debug started venting massive quantities of tcp_output error 55 while sending with syncache noise: > > > y 8 12:14:26 timtest kernel: :63859 to [192.168.5.40]:80; tcp_output: error 55 whilTeC Ps:e n[d1i9n2g. 1(6i8p._5o.u4t3p]u:t64 0371 )t > May 8 12:14:26 timtest kernel: o > May 8 12:14:26 timtest kernel: [192.168.5.40]:80; tcp_output: error 55 while sendingT > May 8 12:14:26 timtest kernel: C > May 8 12:14:26 timtest kernel: P: [192.168.5.43]:63859 to [192.168.5.40]:80; tcp_output: error 55 while sending > May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1) > May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to [192.168.5.40]:80; tcp_output: error 55 while sending > May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:63857 to [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1) > May 8 12:14:26 timtest kernel: TCP: [192.168.5.42]:56421 toT C[P1:9 2[.119628..51.6480.]5:.8403;] :6t3c8p57_ otuot p[u1t9:2 .e1r68r.o5r. 40]:8505; whticlpe_ osuetnpduitn:g e(rirpo_ro utpu5t5 w1h)i > > interspersed with clean blocks of 20 entries or so of: > > May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1) > May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:64037 to [192.168.5.40]:80; tcp_output: error 55 while sending > May 8 12:14:26 timtest kernel: TCP: [192.168.5.43]:63857 to [192.168.5.40]:80; tcp_output: error 55 while sending (ip_output 1) > > > The output did not look appreciably different when the ETIMEDOUT occurred. > > On stopping the client test program: > > May 8 12:14:46 timtest kernel: TCP: [192.168.5.43]:63978 to [192.168.5.40]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored > May 8 12:14:46 timtest kernel: TCP: [192.168.5.43]:63978 to [192.168.5.40]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored > May 8 12:14:46 timtest kernel: TCP: [192.168.5.43]:63978 to [192.168.5.40]:80 tcpflags 0x4; syncache_chkrst: Spurious RST without matching syncache entry (possibly syncookie only), segment ignored > > netstat -m > > 258/11007/11265 mbufs in use (current/cache/total) > 256/1596/1852/25600 mbuf clusters in use (current/cache/total/max) > 256/1536 mbuf+clusters out of packet secondary zone in use (current/cache) > 0/7585/7585/51200 4k (page size) jumbo clusters in use (current/cache/total/max) > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) > 576K/36283K/36860K bytes allocated to network (current/cache/total) > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > 0/4/6656 sfbufs in use (current/peak/max) > 0 requests for sfbufs denied > 0 requests for sfbufs delayed > 0 requests for I/O initiated by sendfile > 0 calls to protocol drain routines > > Thanks again for your help - Tim > > > > > > > > >