Date:      Thu, 29 Mar 2012 20:48:42 GMT
From:      Sergey Smitienko <hunter@comsys.com.ua>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/166501: FreeBSD 9.0 generates incorrect SEQ/ACK numbers under load
Message-ID:  <201203292048.q2TKmgrD073086@red.freebsd.org>
Resent-Message-ID: <201203292050.q2TKoA9m092296@freefall.freebsd.org>


>Number:         166501
>Category:       kern
>Synopsis:       FreeBSD 9.0 generates incorrect SEQ/ACK numbers under load
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Thu Mar 29 20:50:09 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Sergey Smitienko
>Release:        FreeBSD 9.0-RELEASE amd64
>Organization:
Ice technologies ltd
>Environment:
FreeBSD xxx.xxx.xx 9.0-RELEASE FreeBSD 9.0-RELEASE #2: Tue Mar 27 23:46:50 EEST 2012 root@xxx.xxx.xx:/usr/obj/usr/src/sys/IPSEC amd64
>Description:
I've run into a problem with a web server running FreeBSD 9.0/amd64. I
believe the server loses track of the correct SEQ/ACK numbers on some
connections. Here is an example:

15:20:00.347514 IP (tos 0x68, ttl 123, id 1181, offset 0, flags [DF],
proto TCP (6), length 52)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [S], cksum 0x6995
(correct), seq 3881466934, win 8192, options [mss 1460,nop,wscale
2,nop,nop,sackOK], length 0
15:20:00.347526 IP (tos 0x10, ttl 254, id 28065, offset 0, flags [DF],
proto TCP (6), length 44)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [S.], cksum 0x79fa
(correct), seq 2151790680, ack 3881466935, win 0, options [mss 1460],
length 0
15:20:00.361812 IP (tos 0x68, ttl 123, id 1183, offset 0, flags [DF],
proto TCP (6), length 40)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x96c6
(correct), seq 3881466935, ack 2151790681, win 64240, length 0
15:20:00.361869 IP (tos 0x10, ttl 254, id 31305, offset 0, flags [DF],
proto TCP (6), length 40)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x71b7
(correct), seq 2151790681, ack 3881466935, win 8192, length 0

The client then sends a "GET" request:
15:20:48.236181 IP (tos 0x68, ttl 123, id 1353, offset 0, flags [DF],
proto TCP (6), length 626)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [P.], cksum 0x7fc9
(correct), seq 3881466935:3881467521, ack 2151790681, win 64240, length 586

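and then the "ping-pong" starts: the server answers with unrelated numbers
(seq 2991748588, ack 1985077892), and the client keeps re-sending its ACK
with the correct ones: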

15:20:48.236198 IP (tos 0x0, ttl 254, id 63530, offset 0, flags [DF],
proto TCP (6), length 40)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x8a97
(correct), seq 2991748588, ack 1985077892, win 8760, length 0
15:20:48.255998 IP (tos 0x68, ttl 123, id 1357, offset 0, flags [DF],
proto TCP (6), length 40)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x947c
(correct), seq 3881467521, ack 2151790681, win 64240, length 0
15:20:48.256015 IP (tos 0x0, ttl 254, id 53518, offset 0, flags [DF],
proto TCP (6), length 40)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x8a97
(correct), seq 2991748588, ack 1985077892, win 8760, length 0
15:20:48.276084 IP (tos 0x68, ttl 123, id 1360, offset 0, flags [DF],
proto TCP (6), length 40)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x947c
(correct), seq 3881467521, ack 2151790681, win 64240, length 0
15:20:48.276099 IP (tos 0x0, ttl 254, id 42983, offset 0, flags [DF],
proto TCP (6), length 40)
    193.178.147.113.80 > 93.72.14.220.49239: Flags [.], cksum 0x8a97
(correct), seq 2991748588, ack 1985077892, win 8760, length 0
15:20:48.290914 IP (tos 0x68, ttl 123, id 1361, offset 0, flags [DF],
proto TCP (6), length 40)
    93.72.14.220.49239 > 193.178.147.113.80: Flags [.], cksum 0x947c
(correct), seq 3881467521, ack 2151790681, win 64240, length 0

This happens on about 0.01% of connections. The tcpdump was recorded on
193.178.147.113 itself, before the traffic hits the wire, so it is not a
NIC fault. The server is running nginx and serving static content at
200-500 requests per second.
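
To make the inconsistency in the trace above easier to see, here is a
minimal sketch in Python that replays the segments and flags any segment
whose sequence number does not continue where the same sender left off.
The Segment class, the "client"/"server" labels and the find_inconsistent
helper are names invented for this illustration; the numbers are copied
from the tcpdump output. It deliberately ignores retransmissions, loss and
reordering, so it is only a rough check, not a full TCP state tracker.

```python
from dataclasses import dataclass
from typing import Optional

MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


@dataclass
class Segment:
    src: str                 # "client" or "server" (labels for this sketch)
    seq: int
    ack: Optional[int]       # None when the ACK flag is not set
    length: int = 0
    syn: bool = False


def find_inconsistent(segments):
    """Return indices of segments whose seq differs from the sender's next expected seq."""
    snd_nxt = {}  # per sender: next sequence number we expect it to use
    bad = []
    for i, s in enumerate(segments):
        if s.syn:
            # A SYN consumes one sequence number.
            snd_nxt[s.src] = (s.seq + 1) % MOD
            continue
        expected = snd_nxt.get(s.src)
        if expected is not None and s.seq != expected:
            bad.append(i)  # do not let a bogus segment update the state
            continue
        snd_nxt[s.src] = (s.seq + s.length) % MOD
    return bad


# Segments from the trace above (first eight packets of the flow).
TRACE = [
    Segment("client", 3881466934, None, syn=True),
    Segment("server", 2151790680, 3881466935, syn=True),
    Segment("client", 3881466935, 2151790681),
    Segment("server", 2151790681, 3881466935),
    Segment("client", 3881466935, 2151790681, length=586),  # the GET
    Segment("server", 2991748588, 1985077892),              # wrong seq/ack
    Segment("client", 3881467521, 2151790681),
    Segment("server", 2991748588, 1985077892),              # wrong again
]

print(find_inconsistent(TRACE))  # [5, 7]
```

Only the two server segments after the GET are flagged; every client
segment continues from the expected sequence number.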
>How-To-Repeat:
make.conf:

CPUTYPE?=nocona
CFLAGS=-O2 -pipe -fno-strict-aliasing
COPTFLAGS=-O2 -pipe -funroll-loops -ffast-math -fno-strict-aliasing
KERNCONF=IPSEC
OPTIMIZED_CFLAGS=YES
WITHOUT_X11=YES
BUILD_OPTIMIZED=YES
WITH_CPUFLAGS=YES
WITH_OPTIMIZED_CFLAGS=YES

The kernel is GENERIC with the following additions:
options   IPSEC        #IP security
device    crypto

/boot/loader.conf:
net.inet.tcp.tcbhashsize=8192
net.inet.tcp.syncache.hashsize=1024
net.inet.tcp.syncache.bucketlimit=100

/etc/sysctl.conf:

kern.maxvnodes=100000

net.inet.ip.random_id=1
net.inet.ip.portrange.first=10240
net.inet.ip.portrange.last=65535
net.inet.ip.ttl=254

net.inet.tcp.maxtcptw=102400
net.inet.tcp.syncookies=1 
net.inet.tcp.mssdflt=1024

net.inet.icmp.drop_redirect=1
net.inet.icmp.icmplim=100
net.inet.icmp.log_redirect=0
net.inet.icmp.maskrepl=0
net.inet.icmp.icmplim_output=0
net.inet.ip.accept_sourceroute=0

kern.ipc.somaxconn=4096
kern.maxfiles=524288
kern.maxfilesperproc=524288
kern.ipc.maxsockets=524288
kern.ipc.nmbclusters=204800
net.inet.tcp.recvspace=8192
net.inet.tcp.recvbuf_auto=0
net.inet.tcp.sendspace=16384
net.inet.tcp.sendbuf_max=65536
net.inet.tcp.sendbuf_inc=8192
net.inet.tcp.sendbuf_auto=1
kern.ipc.nmbjumbop=192000
kern.ipc.shmall=1048576

>Fix:
n/a

>Release-Note:
>Audit-Trail:
>Unformatted:


