From: John Baldwin
To: "Sean C. Farley"
Cc: freebsd-current@freebsd.org
Date: Wed, 11 Oct 2006 14:27:41 -0400
Subject: Re: Fix for memory leak in setenv/unsetenv
Message-Id: <200610111427.42195.jhb@freebsd.org>
In-Reply-To: <20061011105435.A10713@thor.farley.org>

On Wednesday 11 October 2006 12:15, Sean C. Farley wrote:
> On Tue, 10 Oct 2006, John Baldwin wrote:
>
> > On Friday 06 October 2006 21:13, Sean C. Farley wrote:
> >> Many a moon ago[1], I put together a patch to fix the leak in
> >> setenv() and unsetenv(). A few months ago, I submitted a PR
> >> (kern/99826[2]) for the final fix. I was wondering if anyone would
> >> take a look at it to see if any changes are still warranted. The PR
> >> contains information about the patch and sample programs to test it
> >> out.
> >>
> >> Thank you.
> >>
> >> Sean
> >> 1. http://lists.freebsd.org/pipermail/freebsd-hackers/2005-February/010463.html
> >> 2. http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/99826
> >
> > This still won't work. The reason for the intentional leak is because
> > of this code sequence:
> >
> >     char *a;
> >
> >     setenv("FOO", "0", 1);
> >     a = getenv("FOO");
> >     setenv("FOO", "bar", 1);
> >     printf("FOO was %s\n", a);
> >
> > With the memory leak fixed this will use free'd memory. While this
> > code may seem weird in a program, it actually is quite possible for a
> > library to read and cache the value of an environment variable. If
> > you didn't leave the leak around, the library could cause a crash if
> > the main program (or another library) changed the environment variable
> > the first library had a cached pointer to the value of.
>
> Although it would not crash, the following would fail anyway:
>
>     setenv("FOO", "bar", 1);
>     a = getenv("FOO");
>     setenv("FOO", "0", 1);
>     printf("FOO was %s\n", a);
>
> In this scenario, the printf() would print "0" since the second value
> had a string length less than or equal to the previous value. The
> current implementation of setenv() would reuse the string instead of
> malloc'ing a new one.

Yeah, but it doesn't crash is the point actually.
The pointer is still valid. What it points to may be overwritten with a
newer value, but a library can reliably call getenv() once and know that
the pointer will always point to some value of that variable and never
to anything else.

> Also, this snippet from IEEE Std 1003.1, 2004 Edition regarding
> getenv()[3]:
>
>     The return value from getenv() may point to static data which may be
>     overwritten by subsequent calls to getenv(), setenv(), or
>     unsetenv().
>
> After the call to the second setenv(), a portable application should not
> assume that a still points to the same value. Also, it says "may point
> to static data" suggesting (at least to me) that the pointer may point
> to dynamic memory and be freed following the call to setenv().

No, static memory means it won't be free'd but is in bss, etc. This
statement basically backs up exactly what getenv/setenv currently do:
the value may be overwritten. That paragraph doesn't say that the
pointer will become invalid, just that what it points to may be stale or
be overwritten after the getenv(3) call. Part of the problem is that we
have no way to notify consumers of an environment variable when its
value is changed. Alternatively, we could add a different variant of
getenv that required the user to supply the buffer, but that's not the
API we have.

> > I know for one app at my last job we had a problem with this with TZ,
> > and so we explicitly space padded the timezone name out to a
> > fixed-size each time to avoid the leak.
>
> This is what I am trying to fix in setenv(). :)

I know, and we went with the above workaround rather than hacking
setenv/getenv. :)

-- 
John Baldwin
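
The "variant of getenv that required the user to supply the buffer"
mentioned in the reply could look roughly like the sketch below. The
name getenv_copy and its return convention are hypothetical: they are
not part of libc and are not taken from the patch in kern/99826. The
only point illustrated is that the caller owns the storage, so a later
setenv() or unsetenv() can neither overwrite nor invalidate the copy.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not a libc function: copy the current value of
 * the environment variable `name` into the caller-owned buffer `buf`.
 * Returns 0 on success, or -1 if the variable is unset or its value
 * (including the terminating NUL) does not fit in `buflen` bytes.
 */
static int
getenv_copy(const char *name, char *buf, size_t buflen)
{
    const char *val;
    size_t len;

    val = getenv(name);
    if (val == NULL)
        return (-1);

    len = strlen(val);
    if (len >= buflen)
        return (-1);

    /* Copy the value and its terminating NUL into the caller's storage. */
    memcpy(buf, val, len + 1);
    return (0);
}
```

A library that wants to cache a variable such as TZ would call
getenv_copy("TZ", buf, sizeof(buf)) once and keep buf, instead of
holding on to the pointer returned by getenv(). The cost is that the
cached copy goes stale silently when the variable changes, which is the
same notification problem described in the reply.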