From owner-freebsd-net@FreeBSD.ORG Mon Nov 17 18:37:57 2008 Return-Path: Delivered-To: freebsd-net@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A13241065678 for ; Mon, 17 Nov 2008 18:37:57 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 0F97F8FC1B for ; Mon, 17 Nov 2008 18:37:56 +0000 (UTC) (envelope-from andre@freebsd.org) Received: (qmail 1324 invoked from network); 17 Nov 2008 16:36:49 -0000 Received: from localhost (HELO [127.0.0.1]) ([127.0.0.1]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 17 Nov 2008 16:36:49 -0000 Message-ID: <4921B3C6.5020002@freebsd.org> Date: Mon, 17 Nov 2008 19:11:18 +0100 From: Andre Oppermann User-Agent: Thunderbird 1.5.0.14 (Windows/20071210) MIME-Version: 1.0 To: Hartmut Brandt References: <491F2C47.4050500@dlr.de> <0A4BB2F1-AC9F-4316-94E3-790E2D80F651@freebsd.org> <49201859.2080605@dlr.de> In-Reply-To: <49201859.2080605@dlr.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-net@freebsd.org, Rui Paulo Subject: Re: TCP and syncache question X-BeenThere: freebsd-net@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Networking and TCP/IP with FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Nov 2008 18:37:57 -0000 Hartmut Brandt wrote: > Rui Paulo wrote: >> >> On 15 Nov 2008, at 20:08, Hartmut Brandt wrote: >> >>> Hi, >>> >>> in tcp_syncache.c:syncache_expand() there is a test that the >>> acknowledgement number and the sequence number of an incoming ACK >>> segment are in the expected range. If they are not, syncache_expand() >>> returns 0 and tcp_input drops the segment and sets a reset. So far so >>> good. But syncache_expand() also deletes the syncache entry, and so >>> destroys the connection. I cannot see why it does it. It seems to me >>> that such a wrong segment should be interpreted as to be from another >>> connection and as such the segment should be ignored (but a reset >>> sent). When the correct ACK comes, the connection could still be >>> established. As it is now, the establishment of incoming connections >>> can seriously be disturbed by someone sending fake ACK packets. >>> >>> The same test (for the ack number, not for the sequence number) is >>> also further down in tcp_input.c:tcp_do_segment() (just after the >>> header prediction stuff) and here the handling is correct: the goto >>> dropwithreset just sends a reset and drops the segment but leaves the >>> connection in the SYN-RECEIVED state. This test is probably never >>> reached now, because of syncache_expand(), though. Correct. >>> Maybe I fail to see something obvious, though... >> >> >> Well, if the RST is sent, why should we keep the syncache entry? > > Because this effectively destroys the connection in SYN-RECEIVED which > is wrong according to RFC793. On page 69 the handling of incoming > segments for connections in SYN-RECEIVED is described: first you check > the sequence number and, if it is wrong, you send an RST (unless the RST > bit is set in the incoming segment), but otherwise ignore the segment. > > A segment with a bad sequence number in SYN-RECEIVED is either forged or > from an old connection. In both cases you don't want to destroy the > embryonic connection, because the correct ACK from the correct peer may > still arrive. I see your problem. Syncookies mitigate this problem (if not disabled) as the correct ACK will pass that test later even if the syncache entry went away before (which can also happen due to a generally high SYN load). RFC793 wants us to do the following: Page 69: Send back a challenge ACK with the correct parameters to help to re-synchronize the connection when !(RCV.NXT =< SEG.SEQ+SEG.LEN-1 < RCV.NXT+RCV.WND). Page 72: Send back a RST when !(SND.UNA =< SEG.ACK =< SND.NXT). At the moment we send the RST and delete the syncache entry for both cases. However we should send the ACK in the former, the RST in the latter, and and keep the syncache entry in either case. Fixing this requires some re-shuffling of the syncache_expand(). I'll post a version later today. -- Andre