Date: Tue, 16 Sep 1997 17:35:19 +0200 (SAT)
From: Graham Wheeler <gram@cdsec.com>
To: phk@critter.freebsd.dk (Poul-Henning Kamp)
Cc: freebsd-bugs@freebsd.org, hackers@freebsd.org
Subject: Re: Memory leak in getservbyXXX?
Message-ID: <199709161535.RAA00281@cdsec.com>
In-Reply-To: <2508.874421096@critter.freebsd.dk> from "Poul-Henning Kamp" at Sep 16, 97 04:44:56 pm

First off, I should say that I have found the problem. It is entirely in
my own code, so apologies for wasting people's time. Nonetheless, I
appreciate all the suggestions I got, and I learnt some useful stuff in
the process (like the malloc options, which I never knew existed, not
having done a `man malloc' for several years...)

> I've looked at these stack traces, and I'm pretty sure I know the
> smell of this one.
>
> My guess number one is that malloc bails out and that the topmost
> couple of entries come from the __cleanup that happens in abort().
>
> Make sure that filedescriptor #2 (as in: write(2,"FOO!",4)) is open
> and points to something that will let you read the message, and look
> for messages there.

I'm impressed. The program is run as a daemon, so we weren't seeing the
error message. Yesterday I had the idea of redirecting fd 2 when starting
it up, and got the clue I needed; namely, malloc was reporting a
recursive call.

At the time that the gateway program was first written, FreeBSD did not
yet support POSIX threads. Because the process is performance-critical,
I wanted to avoid blocking calls as much as possible. Amongst such calls
were DNS lookups (gethostbyXXX). So I got clever: I made a hash table of
IP addresses to symbolic names, and I wrote code which spins off a child
process for each DNS lookup which can't be satisfied from the cache. The
child process communicates the result back to the parent via a pipe, and
the parent adds this to the hash table. Because I wanted this to be
completely asynchronous to the normal operation of the gateway, the
parent reads from the pipe and inserts the result in the hash table in
the SIGCHLD handler.

The problem is that inserting an entry in the hash table requires dynamic
memory allocation. So if the SIGCHLD arrives while the main program is
itself inside a call to malloc(), the allocation in the handler becomes
a recursive call into malloc().
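
For illustration only (this is not the gateway's code), here is a minimal
C sketch of one commonly used way to keep malloc() out of a signal
handler: the handler does nothing except set a flag, and the main loop
later reaps the child, reads the pipe and allocates. All names
(dns_entry, cache_head, install_sigchld_handler, drain_dns_result) are
hypothetical, a linked list stands in for the hash table, and the sketch
assumes a single outstanding lookup whose pipe write end the parent has
already closed.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical cache entry; the real gateway's hash table is not shown. */
struct dns_entry {
    char name[256];
    struct dns_entry *next;
};

static struct dns_entry *cache_head;        /* stand-in for the hash table */
static volatile sig_atomic_t child_exited;  /* set in handler, cleared in main loop */

/* The handler only sets a flag: no malloc(), no stdio, no table updates. */
static void on_sigchld(int signo)
{
    (void)signo;
    child_exited = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/*
 * Called from the main loop, never from the handler.
 * Returns 1 if an entry was added, 0 if no child has exited,
 * -1 on read or allocation failure.
 * Assumes one outstanding lookup and that the parent has closed
 * its copy of the pipe's write end, so read() cannot block forever.
 */
int drain_dns_result(int pipe_rd)
{
    char buf[256];
    ssize_t n;
    struct dns_entry *e;

    if (!child_exited)
        return 0;
    child_exited = 0;                   /* clear before reaping, so a late */
    while (waitpid(-1, NULL, WNOHANG) > 0)  /* exit sets the flag again    */
        ;

    n = read(pipe_rd, buf, sizeof buf - 1);
    if (n <= 0)
        return -1;
    buf[n] = '\0';

    e = malloc(sizeof *e);              /* safe: ordinary (non-signal) context */
    if (e == NULL)
        return -1;
    memcpy(e->name, buf, (size_t)n + 1);
    e->next = cache_head;
    cache_head = e;
    return 1;
}
```

A main loop would call install_sigchld_handler() once at startup and
drain_dns_result() on each iteration. The cost is that a result is only
cached when the loop next comes round, not at the instant the child
exits.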

I'm currently trying to work around this by preallocating a hash table
entry for the result before spinning off the child, and making sure that
there are no calls to any other non-reentrant functions in the signal
handler. Later I will use threads rather than a child process.

It turns out that this was also happening under FreeBSD 2.1, but there
(i) it doesn't seem to happen as often and (ii) the process always exits.
There is a second process running on the firewall whose task is somewhat
like SysV's init - namely to keep restarting terminated processes. So we
didn't notice it. Under FreeBSD 2.2.2, the process occasionally exits,
but in most cases starts spinning in a busy loop, so we did notice it.

I am assuming that the spinning is caused by the same problem; there is
always the possibility that there are two bugs, and that the above will
only fix the case where the process exits. I guess I'll find out soon
enough...

[In my defense I should state that the memory allocation is happening in
a virtual method of a class about four levels deeper in the inheritance
hierarchy - by the time that method was written I had probably forgotten
that it was being called from within a signal handler. I guess this
illustrates one of the potential hazards of systems programming in C++.]

graham

--
Dr Graham Wheeler                        E-mail: gram@cdsec.com
Citadel Data Security                    Phone:  +27(21)23-6065/6/7
Internet/Intranet Network Specialists    Mobile: +27(83)-253-9864
Firewalls/Virtual Private Networks       Fax:    +27(21)24-3656
Data Security Products                   WWW:    http://www.cdsec.com/