From owner-freebsd-questions@FreeBSD.ORG  Mon Jul 19 05:13:25 2010
Return-Path: <owner-freebsd-questions@FreeBSD.ORG>
Delivered-To: freebsd-questions@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 598011065670
	for <freebsd-questions@freebsd.org>;
	Mon, 19 Jul 2010 05:13:25 +0000 (UTC)
	(envelope-from bmettee@pchotshots.com)
Received: from mail.pchotshots.com (ns1.pchotshots.com [12.172.123.235])
	by mx1.freebsd.org (Postfix) with SMTP id F23038FC15
	for <freebsd-questions@freebsd.org>;
	Mon, 19 Jul 2010 05:13:24 +0000 (UTC)
Received: (qmail 88570 invoked by uid 89); 19 Jul 2010 05:13:24 -0000
Received: from unknown (HELO ?12.172.123.228?)
	(bmettee@pchotshots.com@12.172.123.228)
	by mail.pchotshots.com with SMTP; 19 Jul 2010 05:13:24 -0000
Message-ID: <4C43DEF5.5060707@pchotshots.com>
Date: Mon, 19 Jul 2010 01:13:25 -0400
From: Brad Mettee <bmettee@pchotshots.com>
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
To: Robert Bonomi <bonomi@mail.r-bonomi.com>
References: <201007190326.o6J3QvqL022578@mail.r-bonomi.com>
In-Reply-To: <201007190326.o6J3QvqL022578@mail.r-bonomi.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-questions@freebsd.org
Subject: Re: Has anybody got a *working* example of getpwnam_r() ??
X-BeenThere: freebsd-questions@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: User questions <freebsd-questions.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-questions>
List-Post: <mailto:freebsd-questions@freebsd.org>
List-Help: <mailto:freebsd-questions-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-questions>, 
	<mailto:freebsd-questions-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 19 Jul 2010 05:13:25 -0000

Robert Bonomi wrote:
> I've _got_ to be doing something wrong, since I'm getting heap corruption
> calling it.  But for the life of me, I can't figure out -what- is wrong.
>
> What I've got makes "a whole lot of no sense" -- I get corruption of
> the _same_ malloc()'d data structure (at *exactly* the same offset into
> the structure!!) on both 7.2 i386 and 8.0 amd64 releases (on different
> hardware).
>
> Unfortunately there is a _lot_ of code, including significant use of
> malloc()/free(), so attempting to whittle things down to a
> minimal test case would be very awkward.  Right now, I've got the
> corruption at a 'known' place, but no clue as to -how- it's 
> happening -- available evidence seems to exclude everything passed
> _into_ getpwnam_r(). 
>
>
> the offending call is :  
>
>    getpwnam_r(cp3, &pw_data, buffer2, sizeof(buffer2), &pwd);
>
> data declaration at the beginning of the function:
>    char buffer[1024];
>    char buffer2[1024];
>    char mailbox[1024];
>    char *cp,*cp2,*cp3,*cp4 = buffer;
>    struct passwd pw_data,pw_data2,*pwd=&pw_data2;
>    int i;
>
> The whole program is around 2500 lines of code and headers, with, as 
> mentioned, _lots_ of malloc()/free() activity.  I can put it up on
> my web-server, if somebody really wants to dig.
>
> I've tried changing the size of buffer2 to 8kb, in case I was
> over-running the 1k buffer. (Unfortunately the manpage does _not_
> specify a minimum size for the buffer.)
> I've tried declaring buffer2 _and_ the 'struct passwd' items as
> 'static', so that _if_ the corruption was coming from one of
> those addresses, the corruption *should* move.
>
> *NONE* of those changes made _any_ difference in where the corruption
> was occurring, or _what_ was being written there.
>
> I'm *really* baffled.   HELP!!! <*whimper*>
>
> _what_ stuff shows up _does_ differ between the 7.2 and 8.0 systems,
>
> 8.0 reliably produces 116 bytes corrupted:
> [gdb command: x/112 &private_data_pointer->remotehostname]
> 0x40a0e070:     103 'g' 114 'r' 111 'o' 117 'u' 112 'p' 0 '\0'  0 '\0'  0 '\0'
> 0x40a0e078:     99 'c'  111 'o' 109 'm' 112 'p' 97 'a'  116 't' 0 '\0'  0 '\0'
> 0x40a0e080:     99 'c'  111 'o' 109 'm' 112 'p' 97 'a'  116 't' 0 '\0'  0 '\0'
> 0x40a0e088:     104 'h' 111 'o' 115 's' 116 't' 115 's' 0 '\0'  0 '\0'  0 '\0'
> 0x40a0e090:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
> 0x40a0e098:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
> 0x40a0e0a0:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
> 0x40a0e0a8:     112 'p' 97 'a'  115 's' 115 's' 119 'w' 100 'd' 0 '\0'  0 '\0'
> 0x40a0e0b0:     99 'c'  111 'o' 109 'm' 112 'p' 97 'a'  116 't' 0 '\0'  0 '\0'
> 0x40a0e0b8:     115 's' 104 'h' 101 'e' 108 'l' 108 'l' 115 's' 0 '\0'  0 '\0'
> 0x40a0e0c0:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
> 0x40a0e0c8:     99 'c'  111 'o' 109 'm' 112 'p' 97 'a'  116 't' 0 '\0'  0 '\0'
> 0x40a0e0d0:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
> 0x40a0e0d8:     102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0'  0 '\0'  0 '\0'
>
>
> 7.2 produces 64 bytes corrupted:
> [gdb command: x/112 &private_data_pointer->remotehostname]
> 0x2820b098:     100 'd' 110 'n' 115 's' 0 '\0'  110 'n' 105 'i' 115 's' 0 '\0'
> 0x2820b0a0:     110 'n' 105 'i' 115 's' 0 '\0'  114 'r' 112 'p' 99 'c'  0 '\0'
> 0x2820b0a8:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'
> 0x2820b0b0:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'
> 0x2820b0b8:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'
> 0x2820b0c0:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  3 '\003'        0 '\0'  8 '\b' 0 '\0'
> 0x2820b0c8:     0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'
> 0x2820b0d0:     2 '\002'        0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0'  0 '\0' 0 '\0'
>
> This stuff looks like it -might- be from a nsswitch.conf parse.
> I dunno.
>
> anybody got _any_ ideas?
>   
This might help.

Just above the crashing call, open a file, dump the contents of the 
vars you're passing to the function with fprintf, then close the file. I 
suspect cp3 doesn't point to valid data and is causing the function call 
to fail (it looks like it's pointing to the same thing as buffer, which 
doesn't look like how the function should be called). Having the 
contents of the individual vars will help you narrow down exactly what's 
occurring. The data you want to see are the pointers themselves, and maybe 
the first 8 or so characters of the data each one points to.
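
Something along these lines would do for the dump. It's only a rough sketch: dump_ptr() is a made-up helper name, not anything from your program or from libc, and the 8-byte preview length is just the "8 or so" mentioned above. It prints the pointer value first and flushes before touching the data, so the address still reaches the file if the pointer turns out to be wild.

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Hypothetical debugging helper (not part of any library).
 * Writes "label = <pointer>" followed by up to n bytes of the string
 * that p points to, stopping early at a NUL.  Non-printable bytes are
 * shown as \xNN.  Returns 0 on success, -1 if fp is NULL.
 */
int dump_ptr(FILE *fp, const char *label, const char *p, size_t n)
{
    size_t i;

    if (fp == NULL)
        return -1;

    fprintf(fp, "%s = %p", label, (const void *)p);
    fflush(fp);             /* get the address out before dereferencing */

    if (p != NULL) {
        fputs(" -> \"", fp);
        for (i = 0; i < n && p[i] != '\0'; i++) {
            unsigned char c = (unsigned char)p[i];
            if (isprint(c))
                fputc(c, fp);
            else
                fprintf(fp, "\\x%02x", c);
        }
        fputc('"', fp);
    }
    fputc('\n', fp);
    fflush(fp);
    return 0;
}
```

Right before the getpwnam_r() call you would fopen() a log file in append mode (the path is up to you), call dump_ptr(fp, "cp3", cp3, 8), do the same for buffer2, and fclose() it. Printing &pw_data and pwd with a plain fprintf "%p" alongside it lets you compare all of those addresses against the one that gets trashed.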

And if this doesn't help, there's always Google CodeSearch 
http://www.google.com/codesearch , for examples of how to call it.
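
For comparison, here is a minimal sketch of the usual POSIX calling pattern. It is not a diagnosis of the corruption above, just a self-contained call to check against. lookup_uid() is a name invented for this example, and the 1024-byte fallback and 1 MB cap are arbitrary assumptions, since the manpage gives no minimum buffer size.

```c
#include <sys/types.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical wrapper around getpwnam_r().
 * Returns 0 and stores the uid in *uid_out if the user exists,
 * 1 if the call succeeded but found no such user, -1 on any error.
 * (Some libcs report a missing user through a nonzero return code
 * instead; this sketch treats that as an error.)
 */
int lookup_uid(const char *name, uid_t *uid_out)
{
    struct passwd pw;               /* filled in by getpwnam_r() */
    struct passwd *result = NULL;   /* set to &pw on success, NULL if not found */
    long hint = sysconf(_SC_GETPW_R_SIZE_MAX);
    size_t buflen = (hint > 0) ? (size_t)hint : 1024;  /* fallback is a guess */
    char *buf = NULL;
    int found;
    int rc;

    for (;;) {
        char *tmp = realloc(buf, buflen);
        if (tmp == NULL) {
            free(buf);
            return -1;
        }
        buf = tmp;

        rc = getpwnam_r(name, &pw, buf, buflen, &result);
        if (rc != ERANGE || buflen >= 1024 * 1024)
            break;                  /* done, or give up growing */
        buflen *= 2;                /* buffer too small: retry with more room */
    }

    found = (rc == 0 && result != NULL);
    if (found)
        *uid_out = result->pw_uid;  /* copy out before freeing buf */

    free(buf);                      /* pw's string members pointed into buf */
    if (rc != 0)
        return -1;
    return found ? 0 : 1;
}
```

The points worth checking against your own call: the return value is an error number rather than -1 with errno set, "not found" shows up as a zero return with the result pointer set to NULL, and the strings in the struct passwd live inside the buffer you passed, so they are only valid as long as that buffer is.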