Date: Thu, 29 May 2008 14:32:43 -0700 (PDT)
From: Matthew Dillon <dillon@apollo.backplane.com>
To: Robert Blayzor <rblayzor.bulk@inoc.net>
Cc: freebsd-stable@freebsd.org
Subject: Re: Sockets stuck in FIN_WAIT_1
Message-ID: <200805292132.m4TLWhCv026720@apollo.backplane.com>
References: <B42F9BDF-1E00-45FF-BD88-5A07B5B553DC@inoc.net> <1A19ABA2-61CD-4D92-A08D-5D9650D69768@mac.com> <23C02C8B-281A-4ABD-8144-3E25E36EDAB4@inoc.net> <483DE2E0.90003@FreeBSD.org> <B775700E-7494-42C1-A9B2-A600CE176ACB@inoc.net> <483E36CE.3060400@FreeBSD.org> <483E3C26.3060103@paradise.net.nz> <483E4657.9060906@FreeBSD.org> <483EA513.4070409@earthlink.net> <96AFE8D3-7EAC-4A4A-8EFF-35A5DCEC6426@inoc.net> <483EAED1.2050404@FreeBSD.org> <200805291912.m4TJCG56025525@apollo.backplane.com> <14DA211A-A9C5-483A-8CB9-886E5B19A840@inoc.net> <200805291930.m4TJUeGX025815@apollo.backplane.com> <0C827F66-09CE-476D-86E9-146AB255926B@inoc.net>
:I think we're onto something here, but for some reason it doesn't make
:any sense.  I have keepalives turned OFF in Apache:
:
:When I tcpdump this, I see something sending ack's back and forth
:every 60 seconds, but what?  Apache?  I'm not sure why.  I don't see
:any timeouts in Apache for ~60 seconds.  As you can see, sometimes we
:send an ack, but never see a reply.  I'm gathering the OS level
:keepalives don't come into play because this session is not considered
:idle?
:
:
:0:13:07.640426 IP 1.1.1.1.80 > 2.2.2.2.33379: .
:4208136508:4208136509(1) ack 1471446041 win 520 <nop,nop,timestamp
:3019088951 5004131>
:20:13:07.736505 IP 2.2.2.2.33379 > 1.1.1.1.80: . ack 0 win 0
:<nop,nop,timestamp 5022148 3019088951>
:20:14:07.702647 IP 1.1.1.1.80 > 2.2.2.2.33379: . 0:1(1) ack 1 win 520
:<nop,nop,timestamp 3019148951 5022148>
:20:15:07.764920 IP 1.1.1.1.80 > 2.2.2.2.33379: . 0:1(1) ack 1 win 520
:<nop,nop,timestamp 3019208951 5022148>
:20:15:07.860988 IP 2.2.2.2.33379 > 1.1.1.1.80: . ack 0 win 0
:<nop,nop,timestamp 5058183 3019208951>
:20:16:07.827262 IP 1.1.1.1.80 > 2.2.2.2.33379: . 0:1(1) ack 1 win 520
:...

Yah, the connection is valid so keepalives do not come into play.

What is happening is that 1.1.1.1 wants to send something to 2.2.2.2,
but 2.2.2.2 is telling 1.1.1.1 that it has no buffer space (win 0).
This forces the TCP stack on 1.1.1.1 (the kernel, not the apache server)
to 'probe' the connection, which it appears to be doing once a minute.
It is probing the connection waiting for 2.2.2.2 to tell it that buffer
space is available (win != 0).

The connection remains valid because 2.2.2.2 continues to respond to
the probes.

Now, the connection is also in a half-closed state, which means that
one direction is closed.  I can't tell which direction that is but my
guess is that 1.1.1.1 (the apache server) closed the 1.1.1.1->2.2.2.2
direction and the 2.2.2.2 box has a broken TCP implementation and can't
deal with it.
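
To make the pattern in the dump easier to spot across many connections, here is a rough Python sketch that counts which side advertises win 0 and which side sends the 1-byte probes. It assumes the old tcpdump text format quoted above with each packet re-joined onto a single line; the regex and the function names (`parse`, `summarize`) are my own and would need adjusting for other tcpdump versions.

```python
import re

# Packets from the quoted dump, one per line (wrapped lines re-joined).
SAMPLE = """\
20:13:07.736505 IP 2.2.2.2.33379 > 1.1.1.1.80: . ack 0 win 0 <nop,nop,timestamp 5022148 3019088951>
20:14:07.702647 IP 1.1.1.1.80 > 2.2.2.2.33379: . 0:1(1) ack 1 win 520 <nop,nop,timestamp 3019148951 5022148>
20:15:07.764920 IP 1.1.1.1.80 > 2.2.2.2.33379: . 0:1(1) ack 1 win 520 <nop,nop,timestamp 3019208951 5022148>
20:15:07.860988 IP 2.2.2.2.33379 > 1.1.1.1.80: . ack 0 win 0 <nop,nop,timestamp 5058183 3019208951>
"""

# Assumed line shape: time, "IP", src > dst:, flags, optional seq range
# with payload length in parentheses, then "ack N win N".
PACKET = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>\S+): \S+ "
    r"(?:\d+:\d+\((?P<len>\d+)\) )?ack \d+ win (?P<win>\d+)"
)


def parse(line):
    """Return a dict for one packet line, or None if it does not match."""
    m = PACKET.match(line)
    if m is None:
        return None
    return {
        "time": m["time"],
        "src": m["src"],
        "dst": m["dst"],
        "len": int(m["len"]) if m["len"] else 0,
        "win": int(m["win"]),
    }


def summarize(text):
    """Count zero-window advertisements and 1-byte probes per sender."""
    zero_win, probes = {}, {}
    for line in text.splitlines():
        pkt = parse(line)
        if pkt is None:
            continue
        if pkt["win"] == 0:
            zero_win[pkt["src"]] = zero_win.get(pkt["src"], 0) + 1
        if pkt["len"] == 1:
            probes[pkt["src"]] = probes.get(pkt["src"], 0) + 1
    return zero_win, probes


if __name__ == "__main__":
    zero_win, probes = summarize(SAMPLE)
    print("advertising win 0:", zero_win)
    print("sending 1-byte probes:", probes)
```

A sender that shows up only in the probe count, paired with a peer that shows up only in the zero-window count, matches the situation described above: the kernel on one side keeps probing while the other side keeps answering with a closed window.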
:I'm finding several of these sessions doing the same exact thing....
:
:--
:Robert Blayzor, BOFH
:INOC, LLC

I can suggest two things.  First, the TCP connection is good but you
still may be able to tell Apache, in the apache configuration file, to
time out after a certain period of time and clear the connection.

Secondly, it may be beneficial to identify exactly what the client and
server were talking about which caused the client to hang with a live
tcp connection.  The only way to do that is to tcpdump EVERYTHING going
on related to the apache server, save it to a big-ass disk partition
(like 500G), and then when you see a stuck connection go back through
the tcpdump log file and locate it, grep it out, and review what exactly
it was talking about.  You'd have to tcpdump with options to tell it to
dump the TCP data payloads.

It seems likely that the client is running an applet or javascript that
receives a stream over the connection, and that applet or javascript
program has locked up, causing the data sent from the server to build up
until the client's buffer space runs out and it starts advertising the
0 window.

-Matt
Matthew Dillon
<dillon@backplane.com>
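
For the "locate it, grep it out" step, a plain grep on the address tends to lose the payload lines that follow each packet header. Below is a small Python sketch that pulls one connection, headers plus any indented payload lines, out of a text dump. It assumes the log is tcpdump text output with one header line per packet and hex/ASCII payload lines indented underneath (as tcpdump prints with `-X`). The function name `extract_connection`, the 3.3.3.3 endpoint, and the hex bytes in the sample are placeholders of mine, not data from the thread.

```python
import re

# Header line of a packet in tcpdump text output: time, "IP", src > dst:
HEADER = re.compile(r"^\S+ IP (?P<src>\S+) > (?P<dst>\S+): ")


def extract_connection(lines, end_a, end_b):
    """Return the lines belonging to the connection between end_a and end_b.

    Endpoints are "address.port" strings as tcpdump prints them. Indented
    lines after a matching header (payload dumps) are kept with it.
    """
    wanted = {end_a, end_b}
    keep = False
    out = []
    for line in lines:
        m = HEADER.match(line)
        if m:
            # New packet: keep it only if both endpoints match, either way round.
            keep = {m["src"], m["dst"]} == wanted
        elif not line[:1].isspace():
            # Unindented line that is not a packet header: not payload.
            keep = False
        if keep:
            out.append(line)
    return out


# Tiny made-up log: two packets from the stuck connection plus one
# unrelated packet (hypothetical 3.3.3.3 client, placeholder hex bytes).
LOG = [
    "20:14:07.702647 IP 1.1.1.1.80 > 2.2.2.2.33379: . 0:1(1) ack 1 win 520",
    "\t0x0000:  4500 0035",
    "20:14:08.000000 IP 1.1.1.1.80 > 3.3.3.3.40000: . ack 1 win 520",
    "\t0x0000:  4500 0028",
    "20:15:07.860988 IP 2.2.2.2.33379 > 1.1.1.1.80: . ack 0 win 0",
]

if __name__ == "__main__":
    for line in extract_connection(LOG, "1.1.1.1.80", "2.2.2.2.33379"):
        print(line)
```

Matching on the unordered pair of endpoints keeps both directions of the conversation together, which is what you want when reading back what the client and server said to each other before the window closed.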