From: Jan Srzednicki
Date: Tue, 27 Nov 2007 14:53:20 +0100
To: freebsd-net@freebsd.org, freebsd-stable@freebsd.org
Subject: connect() returns EADDRINUSE during massive host->host conn rate
Message-ID: <20071127135320.GJ2045@oak.pl>

Hello,

I have a pair of hosts. One of them opens a large number of TCP connections to the other one, all to the same port. This setup mostly works fine, but from time to time (it varies, from once a minute to once every half hour), the connect(2) syscall fails with EADDRINUSE. The connection rate peaks at 50 connection initiations per second.

The socket is non-blocking. The code does the standard job of creating the socket, setting up the relevant fields, setting SO_REUSEADDR and SO_KEEPALIVE, and setting O_NONBLOCK on the descriptor. No bind(2) is performed. The connection is initiated from inside a jail (I'm not sure whether that implies an internal bind(2) to the jail's address). There are no connections from the other host to the first one.
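For reference, the client-side setup described above might look roughly like the following. This is only a sketch in Python, not the actual application code (the post does not show it or name its language); the function names `make_client_socket` and `start_connect` are made up for illustration.

```python
import socket


def make_client_socket():
    """Create a TCP socket set up as described: no bind(), non-blocking."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The two socket options mentioned in the post.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Python's way of putting the descriptor into non-blocking mode
    # (the counterpart of setting O_NONBLOCK with fcntl in C).
    s.setblocking(False)
    return s


def start_connect(s, addr):
    """Start a non-blocking connect and return 0 or an errno value.

    Because bind() was never called, the kernel picks the local port
    during connect(); that is the call reported to fail with EADDRINUSE.
    """
    return s.connect_ex(addr)
```

On a non-blocking socket `connect_ex` normally returns `errno.EINPROGRESS` rather than 0, so a caller would compare the result against `errno.EADDRINUSE` to detect the failure discussed here.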
I've tried tuning the net.inet.ip.portrange variables: I've increased the available port range to over 45000 ports (quite a lot, should be more than enough for just about anything) and I've turned net.inet.ip.portrange.randomized off, but that didn't change anything.

The workaround on the application side, retrying on EADDRINUSE, works pretty well, but hey, from what I know from the Stevens book, that shouldn't be happening, though Google says all BSDs have a bad habit of throwing EADDRINUSE from time to time.

This all happens on a 6.2-RELEASE system. The symptoms are easily reproducible in my environment.

Is there any known fix for that? If there ain't, can it be fixed? :)

-- 
Jan Srzednicki :: http://wrzask.pl/
"Remember, remember, the fifth of November" -- V for Vendetta
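The application-side workaround mentioned above (retrying on EADDRINUSE) could be sketched as follows. This is an illustration under assumptions, not the code actually in use: `connect_with_retry`, `connect_once`, `max_attempts` and `delay` are hypothetical names, and `connect_once` stands for whatever routine creates a fresh socket and connects it, raising `OSError` on failure.

```python
import errno
import time


def connect_with_retry(connect_once, max_attempts=5, delay=0.0):
    """Call connect_once() again whenever it fails with EADDRINUSE.

    connect_once: hypothetical zero-argument callable that performs one
    connection attempt and returns the connected socket (or any result).
    Any other error, or EADDRINUSE on the last attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return connect_once()
        except OSError as exc:
            if exc.errno != errno.EADDRINUSE or attempt == max_attempts:
                raise
            # Optional pause before the next attempt.
            time.sleep(delay)
```

Capping the number of attempts keeps a persistent failure from turning into an endless loop; the right limit and delay would depend on the application.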