From owner-freebsd-bugs Fri Sep 12 05:40:07 1997
Return-Path:
Received: (from root@localhost)
          by hub.freebsd.org (8.8.7/8.8.7) id FAA00278
          for bugs-outgoing; Fri, 12 Sep 1997 05:40:07 -0700 (PDT)
Received: from citadel.cdsec.com (citadel.cdsec.com [192.96.22.18])
          by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id FAA00206;
          Fri, 12 Sep 1997 05:39:58 -0700 (PDT)
Received: (from nobody@localhost)
          by citadel.cdsec.com (8.8.5/8.6.9) id OAA27186;
          Fri, 12 Sep 1997 14:44:05 +0200 (SAT)
Received: by citadel via recvmail id 27184; Fri Sep 12 14:44:02 1997
          by gram.cdsec.com (8.8.5/8.8.5) id OAA20310;
          Fri, 12 Sep 1997 14:16:54 +0200 (SAT)
From: Graham Wheeler
Message-Id: <199709121216.OAA20310@cdsec.com>
Subject: Re: Memory leak in getservbyXXX?
To: mike@smith.net.au (Mike Smith)
Date: Fri, 12 Sep 1997 14:16:53 +0200 (SAT)
Cc: hackers@freebsd.org, freebsd-bugs@freebsd.org
In-Reply-To: <199709121055.UAA02692@word.smith.net.au> from "Mike Smith" at Sep 12, 97 08:25:08 pm
X-Mailer: ELM [version 2.4 PL25-h4.1]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-bugs@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

> > > That's fairly odd; malloc()/free() do not call fstat().  Are you using
> > > the system malloc() or the GNU version?
> >
> > The system malloc, as far as I know. And I did search the source for fstat,
> > and didn't find it, so I agree this is odd. But that's what gdb is reporting
> > in the stack backtrace...
>
> Just out of curiosity, seeing as it appears to be inside malloc inside
> the stdio library, a couple of other questions; do you use the
> funopen() functionality at all?

Nope.

> > > Not as far as I am aware; something like this would have been somewhat
> > > of a showstopper I would expect.  If you have a copy of the CVS
> > > repository on hand extracting the changes between 2.1 and 2.2 would be
> > > very straightforward.
>
> I just went over this; there's been very little at all changed there,
> most of the differences are the addition of YP support.

Yup - virtually no difference (rats!)

> Each of these stuffups is actually deep inside the stdio library; I
> suspect that this is just about all your application actually does with
> stdio?

Yes - all debugging, etc is done with syslog. There is a config file read
at the start, and upon receipt of a SIGHUP. It is also possible to open a
domain socket to the app and issue some commands, but this is for my use;
the client isn't doing this.

> btw. you might want to call setservent(1) before making lookups
> to avoid the open/close overhead you're incurring with
> every getserv* call you're currently making.

Good idea.

> Looking more, do you fiddle with the _bf._base or _nbuf fields in any
> FILE structures?  Particularly after you've closed the file?

No, I don't touch the FILE innards directly at all.

> > process for nearly a year without a restart or reboot). So if the problem
> > is a memory leak, for example, it may not show up except in situations like
> > ours where there are so many calls.
>
> If it's a memory leak then it sounds like it's inside the stdio
> library, possibly in fopen()/fclose().  Without more data it's going to
> be hard to track this one.

Tell me about it 8-( It does seem like a memory leak, as the memory use
reported by top grows over time. I have memory allocation debugging code
which confirms that I have no leaks in my code (at least of C++ objects),
and the fact that older sites have been running for months seems to confirm
this. It's a pity we can't just kill and restart the app every hour or so,
but unfortunately that would result in any TCP connections across the
firewall being terminated gracelessly; I think they may find that even more
irritating than the current problem.
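
For reference, the setservent(1) suggestion quoted above might look
something like the sketch below. The helper name lookup_ports() and the
choice of "tcp" are my own assumptions for illustration; only setservent(),
getservbyname() and endservent() are the real library calls. The point is
that a non-zero "stayopen" argument keeps the services database open across
lookups instead of reopening it for each one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <netdb.h>
#include <arpa/inet.h>   /* ntohs() */

/*
 * Hypothetical helper (not from the original application): resolve n TCP
 * service names in one batch. ports[i] receives the port in host byte
 * order, or -1 if the name is not in the services database.
 */
void lookup_ports(const char *const names[], int ports[], size_t n)
{
    assert(names != NULL && ports != NULL);

    setservent(1);              /* stayopen: keep the database open */
    for (size_t i = 0; i < n; i++) {
        struct servent *se = getservbyname(names[i], "tcp");
        /* s_port is stored in network byte order */
        ports[i] = se ? (int)ntohs((uint16_t)se->s_port) : -1;
    }
    endservent();               /* close the database when finished */
}
```

In a long-running daemon one would more likely call setservent(1) once at
startup and never call endservent(); the batch form above just keeps the
sketch self-contained.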

> If you have a forgiving customer, it would be immensely useful to build
> a copy of libc with debugging symbols and link your binary static, and
> then have the user run that until it dies.

I may have to do this, and give them a new kernel with kernel trace support
compiled in. But this is quite tricky - they're about 1000 miles away, and
don't have the skills to do a kernel update themselves, so it will be a
last resort.

> What happens if you install the 2.1 compatibility kit and run the 2.1
> version of the application?

Unfortunately this is also tricky - the format of the config file has
changed between the two versions, and all the other apps (proxies, MTAs,
admin agents, etc) use this.

> > I'm considering prescanning the tables used once at the start and looking
> > up all the services once only; this is a good idea from a performance point
> > of view and may make the problem go away.
>
> If you can spare the memory, that's far and away the best approach.

Well, it could at least give a good indication of whether the calls to
getservbyXXX are in fact the cause of the problem.

regards
Graham

--
Dr Graham Wheeler                          E-mail: gram@cdsec.com
Citadel Data Security                      Phone:  +27(21)23-6065/6/7
Internet/Intranet Network Specialists      Mobile: +27(83)-253-9864
Firewalls/Virtual Private Networks         Fax:    +27(21)24-3656
Data Security Products                     WWW:    http://www.cdsec.com/
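
The prescanning idea discussed above (walk the services table once at
startup, then answer every later lookup from memory) could be sketched
roughly as follows. All of the svc_* names and the struct layout are
hypothetical and are not taken from the actual application; the only
library calls used are setservent(), getservent() and endservent(). After
the one-time load, no further getservbyXXX calls are made, which would also
show whether those calls are what is leaking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <netdb.h>
#include <arpa/inet.h>   /* ntohs() */

/* Hypothetical cache entry; layout is illustrative only. */
struct svc_entry {
    char *name;
    char *proto;
    int   port;          /* host byte order */
};

static struct svc_entry *svc_tab;
static size_t svc_len, svc_cap;

/* Private copy of a string, so nothing points into libc's static buffers. */
static char *svc_copy(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Append one entry; returns 0 on success, -1 if out of memory. */
int svc_cache_add(const char *name, int port, const char *proto)
{
    assert(name != NULL && proto != NULL);
    if (svc_len == svc_cap) {
        size_t ncap = svc_cap ? svc_cap * 2 : 64;
        struct svc_entry *nt = realloc(svc_tab, ncap * sizeof *nt);
        if (nt == NULL)
            return -1;
        svc_tab = nt;
        svc_cap = ncap;
    }
    char *n = svc_copy(name);
    char *p = svc_copy(proto);
    if (n == NULL || p == NULL) {
        free(n);
        free(p);
        return -1;
    }
    svc_tab[svc_len].name = n;
    svc_tab[svc_len].proto = p;
    svc_tab[svc_len].port = port;
    svc_len++;
    return 0;
}

/* Walk the services database once and copy every entry into the cache. */
int svc_cache_load(void)
{
    struct servent *se;
    int rc = 0;

    setservent(1);
    while (rc == 0 && (se = getservent()) != NULL)
        rc = svc_cache_add(se->s_name, (int)ntohs((uint16_t)se->s_port),
                           se->s_proto);
    endservent();
    return rc;
}

/* Linear search of the cache; returns the service name or NULL. */
const char *svc_cache_name(int port, const char *proto)
{
    for (size_t i = 0; i < svc_len; i++)
        if (svc_tab[i].port == port && strcmp(svc_tab[i].proto, proto) == 0)
            return svc_tab[i].name;
    return NULL;
}
```

A linear search is used purely to keep the sketch short; with the number of
lookups described in this thread, a table sorted by port or a hash would be
the more sensible choice. Service aliases (s_aliases) are ignored here.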