Date:      Wed, 21 Nov 2007 10:56:18 +0100 (CET)
From:      Oliver Fromme <olli@lurza.secnetix.de>
To:        freebsd-current@FreeBSD.ORG, pyunyh@gmail.com
Subject:   Re: Sockets stuck in SYN_RCVD (re(4), RELENG_7, i386)
Message-ID:  <200711210956.lAL9uI0o097057@lurza.secnetix.de>
In-Reply-To: <20071121011120.GB13817@cdnetworks.co.kr>

Pyun YongHyeon wrote:
 > On Tue, Nov 20, 2007 at 04:19:18PM +0100, Oliver Fromme wrote:
 > > Some additional information.
 > > 
 > > Today I have run the re(4) interface at 100 Mbps for a few
 > > hours.  The count did still increase, so it's not a GigE-
 > > only problem.
 > > 
 > > Then I disabled RXCSUM,TXCSUM on the interface.  Again, the
 > > counter still increased.  So hardware checksumming isn't
 > > the cause of the problem either.
 > > 
 > > Anything else I could try?
 > 
 > re(4) is not smart enough to analyze packet payload. The hardware
 > also doesn't have a feature like TCP header split so I think re(4)
 > wouldn't influence TCP traffic by itself.

I see.  So it does not seem to be a bug in re(4).

My first suspects were the IPFW rules.  But they're quite
simple (only 20 rules) and I'm sure they're correct.
Apart from that, if a faulty rule were blocking SYN+ACK
packets or similar, then no TCP connections would work at
all.  And even in that case, entries in SYN_RCVD should
time out quickly (after about 45 seconds, I think), not
linger for several days.

So my current suspect is a bug in the syncache code.
That bug is probably triggered by something exceptional,
because I don't see the problem on any other machine,
not even on the one which is almost identical in hardware
and OS.

I would like to ask everybody to have a look at the
output from "sysctl net.inet.tcp.syncache.count".
Does anybody else have a non-zero value that slowly
increases?  If so, it would be interesting to find out
if there are any similarities with my machine.
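
For anyone who wants to watch the counter over time rather than
check it once, here is a minimal sketch.  It assumes a FreeBSD
host where "sysctl -n net.inet.tcp.syncache.count" prints the
bare value; the helper names, the one-hour interval and the
sample count are arbitrary choices for illustration only.

```python
import subprocess
import time

OID = "net.inet.tcp.syncache.count"


def parse_count(output: str) -> int:
    """Accept either 'oid: 702' (plain sysctl) or '702' (sysctl -n)."""
    text = output.strip()
    if ":" in text:
        text = text.rsplit(":", 1)[1]
    return int(text)


def read_count() -> int:
    """Read the current syncache entry count via sysctl(8)."""
    result = subprocess.run(
        ["sysctl", "-n", OID],
        capture_output=True, text=True, check=True,
    )
    return parse_count(result.stdout)


def watch(interval_s: int = 3600, samples: int = 24) -> None:
    """Print the counter and its change since the previous sample."""
    previous = None
    for _ in range(samples):
        current = read_count()
        delta = "" if previous is None else f" ({current - previous:+d})"
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} {OID}: {current}{delta}")
        previous = current
        time.sleep(interval_s)


if __name__ == "__main__":
    watch()
```

On a healthy machine the value should hover around zero; a value
that only ever goes up would match what I see here.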

 > Your dmesg indicates that you're using slightly old rgephy(4) on 7.0.
 > I touched rgephy(4) to support a newer PHY and fixed several bugs. If
 > speed/duplex mismatch were the cause of the issue you would see lots
 > of input errors in the output of "netstat -ndi". If so, try
 > the latest rgephy(4).

I don't think that's the cause.  I tried with and without
auto-select, forcing the interface to 100 and GigE, and
all of that did not affect the behaviour at all.  The
error counters are all zero:

Name  Mtu Network  Address    Ipkts Ierrs    Opkts Oerrs  Coll Drop
re0  1500 <Link#1> [...]   28363007     0 25430349     0     0    0

 > > > net.inet.tcp.syncache.count: 702
 > > 
 > > It's now at 731.

And now at 832.  So it grows by more than 100 entries per
day.
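
As a rough check of that rate, the sketch below extrapolates from
the last two readings.  It assumes each value was read at about
the time the mail quoting it was sent (731 on Tue, 20 Nov 2007
16:19:18 +0100 and 832 on Wed, 21 Nov 2007 10:56:18 +0100), which
is only an approximation; the function name is made up for this
example.

```python
from datetime import datetime


def growth_per_day(t0: datetime, c0: int, t1: datetime, c1: int) -> float:
    """Linear growth rate in entries per 24 hours between two readings."""
    elapsed_days = (t1 - t0).total_seconds() / 86400.0
    return (c1 - c0) / elapsed_days


# Both timestamps are in the same time zone (+0100), so naive
# datetimes are sufficient for the difference.
earlier = datetime(2007, 11, 20, 16, 19, 18)
later = datetime(2007, 11, 21, 10, 56, 18)

rate = growth_per_day(earlier, 731, later, 832)
print(f"{rate:.0f} entries/day")
```

Under that assumption the two readings lie a bit less than 19
hours apart, so the extrapolated rate comes out somewhat above
the raw difference of 101 entries.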

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

 > Can the denizens of this group enlighten me about what the
 > advantages of Python are, versus Perl ?
"python" is more likely to pass unharmed through your spelling
checker than "perl".
        -- An unknown poster and Fredrik Lundh


