From: Julian Elischer
Date: Wed, 28 Nov 2007 10:22:08 -0800
To: Jan Srzednicki
Cc: freebsd-net@freebsd.org, freebsd-stable@freebsd.org
Subject: Re: connect() returns EADDRINUSE during massive host->host conn rate
Message-ID: <474DB1D0.3010100@elischer.org>
In-Reply-To: <20071127135320.GJ2045@oak.pl>

Jan Srzednicki wrote:
> Hello,
>
> I have a pair of hosts. One of them performs a massive amount of
> TCP connections to the other one, all to the same port.
> This setup mostly works fine, but from time to time (anywhere from
> once a minute to once every half hour), the connect(2) syscall fails
> with EADDRINUSE. The connection rate tops out at 50 connection

So, what does netstat -aAn show?

> initiations/second.
>
> The socket is non-blocking. It does the standard job of creating the
> socket, setting up the relevant fields, setting SO_REUSEADDR and
> SO_KEEPALIVE, and setting O_NONBLOCK on the descriptor. No bind(2) is
> performed. The connection is initiated from inside a jail (not sure
> if that implies an internal bind(2) to the jail's address). There are
> no connections from the other host to the first one.
>
> I've tried tuning the net.inet.ip.portrange variables: I've increased
> the available port range to over 45000 ports (quite a lot, should be
> more than enough for just about anything) and I've toggled
> net.inet.ip.portrange.randomized off, but that didn't change anything.
>
> The workaround on the application side - retrying on EADDRINUSE - works
> pretty well, but hey, from what I know from the Stevens book, that
> shouldn't be happening, though Google said all BSDs had a bad habit of
> throwing out EADDRINUSE from time to time.
>
> This all happens on a 6.2-RELEASE system. The symptoms are easily
> reproducible in my environment.
>
> Is there any known fix for that? If there ain't, can it be fixed? :)
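
For reference, the application-side workaround described in the quoted
message (retrying when connect() fails with EADDRINUSE) might look
roughly like the sketch below. It is written in Python for brevity and
is not the original application's code: the helper name
connect_with_retry and its attempts, delay, and timeout parameters are
hypothetical, and it uses a blocking socket with a timeout instead of
the O_NONBLOCK descriptor mentioned above.

```python
import errno
import socket
import time


def connect_with_retry(host, port, attempts=5, delay=0.05, timeout=5.0):
    """Open a TCP connection, retrying only when connect() reports EADDRINUSE.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    last_error = None
    for _ in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            # Same socket options the quoted message lists; no explicit bind().
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
            # Assumption: blocking connect with a timeout, for simplicity.
            sock.settimeout(timeout)
            sock.connect((host, port))
            return sock
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                # Any other failure is passed to the caller unchanged.
                raise
            last_error = exc
            # Short pause before asking the kernel for another local port.
            time.sleep(delay)
    if last_error is None:
        raise ValueError("attempts must be at least 1")
    raise last_error
```

Each retry creates a fresh socket, so the kernel picks a new ephemeral
port on the next connect(). This only hides the symptom; it does not
explain why the port selection fails in the first place.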