Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 2 Nov 2012 23:29:03 GMT
From:      Fabian Keil <fk@fabiankeil.de>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/173309: TCP connections often prematurely closed by the server side after r242262
Message-ID:  <201211022329.qA2NT3ua049380@red.freebsd.org>
Resent-Message-ID: <201211022330.qA2NU1R9081635@freefall.freebsd.org>

next in thread | raw e-mail | index | archive | help

>Number:         173309
>Category:       kern
>Synopsis:       TCP connections often prematurely closed by the server side after r242262
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Nov 02 23:30:01 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator:     Fabian Keil
>Release:        HEAD
>Organization:
>Environment:
FreeBSD r500.local 10.0-CURRENT FreeBSD 10.0-CURRENT #501 r+c526bdf: Fri Nov  2 17:47:51 CET 2012     fk@r500.local:/usr/obj/usr/src/sys/ZOEY  amd64

>Description:
With a kernel from 2012-11-02, Privoxy's regression tests no longer
work reliably. While sending an HTTP response body that is 131111
bytes long, write() frequently returns -1 before the whole response
has been written. errno is 1.

If Privoxy sends the response in chunks of 10000 bytes, the failure
most of the time happens after successfully writing 110000 bytes,
or not at all.

The chunk whose delivery fails from Privoxy's point of view is successfully
received by the client, only the unwritten data is actually missing,
usually the last 11111 byte (when writing in chunks of 10000 bytes).

If Privoxy sends the response in chunks of 1000 bytes, all the writes
seem to succeed, but occasionally the shutdown() fails and the client
doesn't get all the written data anyway.

The problem can be reproduced without Privoxy by using either gatling or
nc as servers, and fetch or curl as clients. I didn't try other combinations.

Clients and server are running on the same system.

Using gatling as server and curl as client, I'm frequently seeing similar
issues with body sizes of 90k or more while 80k files are fine:

fk@r500 ~ $i=1; while curl -o /dev/zero 'http://127.0.0.1:81/90k-file'; do echo $i; ((i=$i+1)); done
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 92160  100 92160    0     0  34.5M      0 --:--:-- --:--:-- --:--:-- 43.9M
1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
 97 92160   97 89612    0     0   189k      0 --:--:-- --:--:-- --:--:--  189k
curl: (18) transfer closed with 2548 bytes remaining to read

With an 80k-file I aborted the loop after 500 successful transfers in a row.

What connections for failing transfers have in common seems to be that
the client side sends a TCP window size update before the server is done
transmitting and shortly thereafter the server side sets the FIN flag
for a packet that does not actually contain the end of the HTTP body.

While not all transfers on connections where the client side sends a TCP
window size while the transfer is in progress fail, I haven't seen failed
transfers on connections where the client side doesn't send a TCP window
update at all, or where the server side is already done transmitting when
the TCP window size update is received.

Even with larger response bodies the transfer either fails early in
roughly the same area, or doesn't fail at all.


An incomplete transfer (packet 495 already has the FIN flag set,
even though it doesn't contain the end of the HTTP response):

13:38:41.814496 IP (tos 0x0, ttl 64, id 469, offset 0, flags [DF], proto TCP (6), length 60)
    10.0.0.1.16274 > 10.0.0.1.82: Flags [S], seq 2309749620, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 3670927927 ecr 0], length 0
13:38:41.814594 IP (tos 0x0, ttl 64, id 470, offset 0, flags [DF], proto TCP (6), length 60)
    10.0.0.1.82 > 10.0.0.1.16274: Flags [S.], seq 2462976767, ack 2309749621, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 1613162519 ecr 3670927927], length 0
13:38:41.814644 IP (tos 0x0, ttl 64, id 471, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.16274 > 10.0.0.1.82: Flags [.], seq 1, ack 1, win 1275, options [nop,nop,TS val 3670927927 ecr 1613162519], length 0
[...]
13:38:41.816802 IP (tos 0x0, ttl 64, id 486, offset 0, flags [DF], proto TCP (6), length 16384)
    10.0.0.1.82 > 10.0.0.1.16274: Flags [.], seq 79873:96205, ack 145, win 1275, options [nop,nop,TS val 1613162521 ecr 3670927929], length 16332
13:38:41.816842 IP (tos 0x0, ttl 64, id 487, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.16274 > 10.0.0.1.82: Flags [.], seq 145, ack 96205, win 60, options [nop,nop,TS val 3670927929 ecr 1613162521], length 0
13:38:41.816892 IP (tos 0x0, ttl 64, id 488, offset 0, flags [DF], proto TCP (6), length 104)
    10.0.0.1.82 > 10.0.0.1.16274: Flags [P.], seq 96205:96257, ack 145, win 1275, options [nop,nop,TS val 1613162521 ecr 3670927929], length 52
13:38:41.834977 IP (tos 0x0, ttl 64, id 491, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.16274 > 10.0.0.1.82: Flags [.], seq 145, ack 96257, win 571, options [nop,nop,TS val 3670927947 ecr 1613162521], length 0
13:38:41.852693 IP (tos 0x0, ttl 64, id 492, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.16274 > 10.0.0.1.82: Flags [.], seq 145, ack 96257, win 1083, options [nop,nop,TS val 3670927965 ecr 1613162521], length 0
13:38:42.089105 IP (tos 0x0, ttl 64, id 493, offset 0, flags [DF], proto TCP (6), length 16384)
    10.0.0.1.82 > 10.0.0.1.16274: Flags [.], seq 96257:112589, ack 145, win 1275, options [nop,nop,TS val 1613162794 ecr 3670927965], length 16332
13:38:42.189093 IP (tos 0x0, ttl 64, id 494, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.16274 > 10.0.0.1.82: Flags [.], seq 145, ack 112589, win 1275, options [nop,nop,TS val 3670928302 ecr 1613162794], length 0
13:38:42.535093 IP (tos 0x0, ttl 64, id 495, offset 0, flags [DF], proto TCP (6), length 104)
    10.0.0.1.82 > 10.0.0.1.16274: Flags [FP.], seq 112589:112641, ack 145, win 1275, options [nop,nop,TS val 1613163240 ecr 3670928302], length 52
13:38:42.535141 IP (tos 0x0, ttl 64, id 496, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.16274 > 10.0.0.1.82: Flags [.], seq 145, ack 112642, win 1275, options [nop,nop,TS val 3670928648 ecr 1613163240], length 0
13:38:42.535249 IP (tos 0x0, ttl 64, id 497, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.16274 > 10.0.0.1.82: Flags [F.], seq 145, ack 112642, win 1275, options [nop,nop,TS val 3670928648 ecr 1613163240], length 0
13:38:42.535269 IP (tos 0x0, ttl 64, id 498, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.82 > 10.0.0.1.16274: Flags [.], seq 112642, ack 146, win 1275, options [nop,nop,TS val 1613163240 ecr 3670928648], length 0


A complete transfer (packet 522 contains the last bytes of the HTTP response):

13:38:46.371832 IP (tos 0x0, ttl 64, id 499, offset 0, flags [DF], proto TCP (6), length 60)
    10.0.0.1.27735 > 10.0.0.1.82: Flags [S], seq 1075068732, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 4016437726 ecr 0], length 0
13:38:46.371931 IP (tos 0x0, ttl 64, id 500, offset 0, flags [DF], proto TCP (6), length 60)
    10.0.0.1.82 > 10.0.0.1.27735: Flags [S.], seq 1854699263, ack 1075068733, win 65535, options [mss 16344,nop,wscale 6,sackOK,TS val 3342014437 ecr 4016437726], length 0
13:38:46.371982 IP (tos 0x0, ttl 64, id 501, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.27735 > 10.0.0.1.82: Flags [.], seq 1, ack 1, win 1275, options [nop,nop,TS val 4016437726 ecr 3342014437], length 0
[...]
13:38:46.386674 IP (tos 0x0, ttl 64, id 520, offset 0, flags [DF], proto TCP (6), length 16384)
    10.0.0.1.82 > 10.0.0.1.27735: Flags [.], seq 96257:112589, ack 145, win 1275, options [nop,nop,TS val 3342014452 ecr 4016437741], length 16332
13:38:46.386744 IP (tos 0x0, ttl 64, id 521, offset 0, flags [DF], proto TCP (6), length 16384)
    10.0.0.1.82 > 10.0.0.1.27735: Flags [.], seq 112589:128921, ack 145, win 1275, options [nop,nop,TS val 3342014452 ecr 4016437741], length 16332
13:38:46.386778 IP (tos 0x0, ttl 64, id 522, offset 0, flags [DF], proto TCP (6), length 2477)
    10.0.0.1.82 > 10.0.0.1.27735: Flags [P.], seq 128921:131346, ack 145, win 1275, options [nop,nop,TS val 3342014452 ecr 4016437741], length 2425
13:38:46.386845 IP (tos 0x0, ttl 64, id 523, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.27735 > 10.0.0.1.82: Flags [.], seq 145, ack 128921, win 61, options [nop,nop,TS val 4016437741 ecr 3342014452], length 0
13:38:46.395842 IP (tos 0x0, ttl 64, id 524, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.27735 > 10.0.0.1.82: Flags [.], seq 145, ack 131346, win 535, options [nop,nop,TS val 4016437750 ecr 3342014452], length 0
13:38:46.407723 IP (tos 0x0, ttl 64, id 525, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.27735 > 10.0.0.1.82: Flags [.], seq 145, ack 131346, win 1047, options [nop,nop,TS val 4016437762 ecr 3342014452], length 0
13:38:46.417413 IP (tos 0x0, ttl 64, id 526, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.27735 > 10.0.0.1.82: Flags [F.], seq 145, ack 131346, win 1275, options [nop,nop,TS val 4016437772 ecr 3342014452], length 0
13:38:46.417484 IP (tos 0x0, ttl 64, id 527, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.82 > 10.0.0.1.27735: Flags [.], seq 131346, ack 146, win 1275, options [nop,nop,TS val 3342014483 ecr 4016437772], length 0
13:38:46.417523 IP (tos 0x0, ttl 64, id 528, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.82 > 10.0.0.1.27735: Flags [F.], seq 131346, ack 146, win 1275, options [nop,nop,TS val 3342014483 ecr 4016437772], length 0
13:38:46.417563 IP (tos 0x0, ttl 64, id 529, offset 0, flags [DF], proto TCP (6), length 52)
    10.0.0.1.27735 > 10.0.0.1.82: Flags [.], seq 146, ack 131347, win 1275, options [nop,nop,TS val 4016437772 ecr 3342014483], length 0

>How-To-Repeat:
Request 90k (or more) from HTTP server running on the same system.

If more than 5 complete response bodies are received in a row, the system probably isn't affected.

I didn't test with client and server running on different systems
or with protocols other than HTTP.
>Fix:
Downgrading sys/netinet/tcp_input.c to the state before r242262 works around the problem.

>Release-Note:
>Audit-Trail:
>Unformatted:



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201211022329.qA2NT3ua049380>