From: Arjan van Leeuwen
Reply-To: avleeuwen@piwebs.com
To: freebsd-current@freebsd.org
Cc: Robert Watson, current@freebsd.org, "David A. Benfell"
Date: Mon, 7 Jun 2004 21:31:12 +0200
Message-Id: <200406072131.14477.avleeuwen@piwebs.com>
Subject: Re: file descripter leak in current with Qmail?

Hi,

On Monday 07 June 2004 19:06, Robert Watson wrote:
> On Mon, 7 Jun 2004, David A. Benfell wrote:
(...)
>
> However, I think the more serious element here is the reason why you reach
> the limit: this happens "naturally" under some workloads simply because of
> large numbers of open files and network connections. However, in some
> workloads, it's a symptom of a system or application bug, such as a
> resource leak.
>
> Because the resources were returned when qmail was killed, that largely
> eliminates the possibility of a kernel resource leak (not entirely, but
> largely), as most kernel resource leaks involving file descriptors have
> the symptom that even after the process exits, the resources aren't
> released (i.e., a reference counting bug or race). This suggests a user
> space issue -- that doesn't eliminate a system bug, as it could be a bug
> in a library that manages descriptors, but it also suggests the
> possibility of an application bug, or at least, a poor application
> interaction with a system bug. Occasionally, we've seen bugs in the
> threading libraries that result in leaked descriptors, but my recollection
> is that qmail doesn't use threads. So that suggests either a support
> library (perhaps crypto or the like), or qmail itself. Or that you just
> hit an extremely high load. :-)
>
> In terms of debugging it: your first task is to identify if there's one
> process that's holding all the fd's, or if it is distributed over many
> processes. After that, you want to track down what kind of fd is being
> left open, which may help you track down why it's left open...

Just as I'm reading this, I'm seeing the same thing on my -CURRENT server,
which has a _very_ low load (at the moment, it's only routing the internet
traffic for 3 users and serving SMTP for 2 of them). I'm also running qmail.
The kernel is from June 6.

How do I go about investigating this further?

Best regards,

Arjan van Leeuwen
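
The two debugging steps Robert describes (first find out whether one process holds the descriptors or whether they are spread over many, then see what kind of descriptor is left open) could be sketched roughly as below. This is only an illustration, not something from the thread: the helper names `count_fds`, `count_kinds` and `run_fstat` are made up, and the parsing assumes the default fstat(1) layout in which the first four columns are USER, CMD, PID and FD, command names contain no spaces, and a trailing `*` on the FD marks a non-inode entry such as a socket or pipe.

```python
import subprocess
from collections import Counter


def count_fds(fstat_output):
    """Count numeric descriptor entries per (pid, command) in fstat-style text."""
    counts = Counter()
    for line in fstat_output.splitlines():
        fields = line.split()
        # Assumed layout: USER CMD PID FD ...; skip the header and short lines.
        if len(fields) < 4 or not fields[2].isdigit():
            continue
        # Special entries such as "root", "wd" and "text" are not descriptors.
        if fields[3].rstrip("*").isdigit():
            counts[(int(fields[2]), fields[1])] += 1
    return counts


def count_kinds(fstat_output, pid):
    """Split one process's descriptors into inode-backed and non-inode entries."""
    kinds = Counter()
    for line in fstat_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != str(pid):
            continue
        fd = fields[3]
        if fd.rstrip("*").isdigit():
            # A trailing "*" is assumed to mark a socket, pipe or similar.
            kinds["non-inode" if fd.endswith("*") else "inode"] += 1
    return kinds


def run_fstat():
    """Return the text printed by fstat(1); only meaningful where fstat exists."""
    return subprocess.run(
        ["fstat"], capture_output=True, text=True, check=True
    ).stdout
```

On the affected machine, something like `count_fds(run_fstat()).most_common(5)` would list the processes holding the most descriptors, and `count_kinds(run_fstat(), pid)` for the top process would hint at whether sockets or regular files are piling up. Running it a few minutes apart and comparing the counts is one way to tell a steady leak from a load spike.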