From owner-cvs-all@FreeBSD.ORG Fri Aug 4 05:39:46 2006
Message-Id: <200608040539.k745dM9K063280@sep.oldach.net>
In-Reply-To: <20060803104026.A45647@fledge.watson.org> from Robert Watson at "Aug 3, 2006 10:43:21 am"
To: rwatson@FreeBSD.org (Robert Watson)
Date: Fri, 4 Aug 2006 07:39:22 +0200 (CEST)
From: freebsd-cvs-src@oldach.net (Helge Oldach)
Cc: scottl@samsco.org, kensmith@cse.Buffalo.EDU, ume@FreeBSD.org, cvs-src@FreeBSD.org, bz@FreeBSD.org, cvs-all@FreeBSD.org, src-committers@FreeBSD.org, freebsd-cvs-src@oldach.net
Subject: Re: cvs commit: src/sys/sys param.h src/include Makefile netdb.h res_update.h resolv.h src/include/arpa inet.h nameser.h nameser
List-Id: CVS commit messages for the entire tree

Robert Watson:
> > Well... I've spotted a regression not with the ports tree but with 6-STABLE.
> > On several boxes with this change applied I see lots of sendmails stacking
> > up over time

Just like to add that I've seen this with other processes as well, e.g. with
ftp jobs that pick up the latest CTM deltas via cron. They get stuck in an
early TCP phase. Guessing from tcpdumps it's right after the initial
handshake. In netstat they show up as ESTABLISHED.

> We only concluded that it was not a kernel socket bug a day or so ago, so I'm
> not sure he's had a chance to generate a resolver bug report. He reported
> that the application appeared to have two connected UDP sockets for name
> resolution, and one bad name server entry, but that the resolver appeared to
> be blocked in a read on the UDP socket that didn't have data queued, rather
> than the one that did.

So it seems. To add another observation: In netstat, the entries relating to
the processes disappear after some time. The sockets clear up, but the
process is still stuck.
Also the UDP socket (resolver queries) disappears. Once I kill one such
process, however, it sends a FIN packet and the TCP socket shows up again in
netstat. So it's not completely dead but "just" stuck.

I don't have a clue what is going on. Maybe it's not related to the resolver
update, but this update just triggers another issue. I'm no resolver expert,
but didn't BIND9 implement a new event library? On another occasion I had the
vague feeling that it might be related to the calcru issue that was discussed
in several threads recently.

Just to verify that the issue hadn't been caught in the meantime I built a
system from yesterday's 6-STABLE, but the issue is still present. :-(

Helge