From: Robert Watson
To: net@FreeBSD.org
Date: Wed, 16 Feb 2005 12:13:56 +0000 (GMT)
Subject: solisten() question: why do we check for completed connections?

uipc_syscalls.c:solisten() is responsible for transitioning a socket from a non-listening state to a listening state. It does this at two levels: directly at the socket level, and at the protocol level by calling into the protocol using pru_listen().

I'm currently working on fixing a race between the two layers, but ran into the following question: a code fragment exists in solisten() that checks whether any completed connections are present when the protocol returns to solisten(): if no completed connections are present, it flags the socket as SO_ACCEPTCONN. This fragment has existed in some form or another, as data structures changed, since revision 1.1 when the BSD code was imported into our current CVS repository. Stevens volII also makes fleeting reference to this logic.
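
For readers without the source at hand, the fragment described above can be modelled roughly as follows. This is a user-space sketch, not the kernel code: struct model_socket, model_pru_listen(), model_solisten() and MODEL_SO_ACCEPTCONN are hypothetical stand-ins, and the completed-connection queue is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket-level flag; not the kernel definition. */
#define MODEL_SO_ACCEPTCONN 0x0002

/* Hypothetical, heavily reduced model of a socket. */
struct model_socket {
    int  so_options;       /* socket-level option flags */
    int  so_comp_len;      /* completed connections currently queued */
    bool proto_listening;  /* protocol-level listen state */
};

/* Stand-in for pru_listen(): the protocol enters its listening state. */
static int model_pru_listen(struct model_socket *so)
{
    so->proto_listening = true;
    return 0;
}

/* Model of the two-level transition performed by solisten(). */
static int model_solisten(struct model_socket *so)
{
    int error;

    assert(so != NULL);

    /* Protocol level first. */
    error = model_pru_listen(so);
    if (error != 0)
        return error;

    /*
     * Socket level: the fragment in question. The flag is set only
     * when no completed connections are queued.
     */
    if (so->so_comp_len == 0)
        so->so_options |= MODEL_SO_ACCEPTCONN;

    return 0;
}
```

In this model, a socket that enters model_solisten() with a non-empty completed queue ends up listening at the protocol level but without the socket-level flag, which is exactly the combination the question is about.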
However, the implied semantics don't appear to be documented in the listen(2) man page. Does anyone have any information on why we conditionally set SO_ACCEPTCONN based on the completed connection queue being empty?

The race I'd like to fix is that it's possible for a TCP SYN to come in during the state transition to a listening socket, which causes the TCP code to panic, as it doesn't expect a SYN packet to match a TCPS_LISTEN tcpcb if the socket isn't SO_ACCEPTCONN. This was presumably introduced as part of the SMPng work, where preemption and parallelism are now "more possible".

The easiest fix here would be to push the socket state transition down a layer into the protocol code, such that the socket locking and tests are performed while holding the TCP state locks, causing the multi-layer test-and-set to become atomic (although presumably using a helper function in the socket library functions that support most protocols). This would also close other potential races between multiple consumers of the socket in multiple threads.

However, it would simplify things considerably to drop the logic regarding SO_ACCEPTCONN if it's not actually necessary. Does anyone know anything about this?

Robert N M Watson
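
The proposed fix, where the socket-level flag is set while the protocol locks are held, can be sketched in user space with pthread mutexes. This is only an illustration under stated assumptions: every lk_ name is hypothetical, proto_lock stands in for the TCP state locks, so_lock for the socket lock, and the assert() stands in for the condition that currently panics.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the socket-level flag; not the kernel definition. */
#define LK_SO_ACCEPTCONN 0x0002

/* Hypothetical, heavily reduced model of a socket plus its protocol state. */
struct lk_socket {
    pthread_mutex_t proto_lock;  /* stands in for the TCP state locks */
    pthread_mutex_t so_lock;     /* stands in for the socket lock */
    int  so_options;             /* socket-level option flags */
    bool tcp_listen;             /* models a tcpcb in TCPS_LISTEN */
};

static void lk_socket_init(struct lk_socket *so)
{
    pthread_mutex_init(&so->proto_lock, NULL);
    pthread_mutex_init(&so->so_lock, NULL);
    so->so_options = 0;
    so->tcp_listen = false;
}

/*
 * Hypothetical socket-library helper. The caller must already hold
 * proto_lock, so the flag update is part of the protocol's critical section.
 */
static void lk_solisten_helper(struct lk_socket *so)
{
    pthread_mutex_lock(&so->so_lock);
    so->so_options |= LK_SO_ACCEPTCONN;
    pthread_mutex_unlock(&so->so_lock);
}

/* Protocol listen routine: both layers change state under proto_lock. */
static int lk_tcp_listen(struct lk_socket *so)
{
    pthread_mutex_lock(&so->proto_lock);
    so->tcp_listen = true;
    lk_solisten_helper(so);
    pthread_mutex_unlock(&so->proto_lock);
    return 0;
}

/* Input path for an arriving SYN; returns true if it matched a listener. */
static bool lk_tcp_input_syn(struct lk_socket *so)
{
    bool matched = false;

    pthread_mutex_lock(&so->proto_lock);
    if (so->tcp_listen) {
        pthread_mutex_lock(&so->so_lock);
        /* Invariant: a listening tcpcb implies a flagged socket. */
        assert((so->so_options & LK_SO_ACCEPTCONN) != 0);
        pthread_mutex_unlock(&so->so_lock);
        matched = true;
    }
    pthread_mutex_unlock(&so->proto_lock);
    return matched;
}
```

Because both paths take proto_lock before so_lock, the input path in this model sees either neither state change or both, never a listening tcpcb paired with an unflagged socket. The sketch sets the flag unconditionally; whether the completed-queue test needs to survive is the open question above.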