From: Brad Mettee
Date: Mon, 19 Jul 2010 01:13:25 -0400
To: Robert Bonomi
Cc: freebsd-questions@freebsd.org
Subject: Re: Has anybody got a *working* example of getpwnam_r() ??

Robert Bonomi wrote:
> I've _got_ to be doing something wrong, since I'm getting heap corruption
> calling it. But for the life of me, I can't figure out -what- is wrong.
>
> What I've got makes "a whole lot of no sense" -- I get corruption of
> the _same_ malloc()'d data structure (at *exactly* the same offset into
> the structure!!) on both 7.2 i386 and 8.0 amd64 releases (on different
> hardware).
>
> Unfortunately there is a _lot_ of code, including significant use
> of malloc()/free(), so attempting to whittle things down to a
> minimal test case would be very awkward. Right now, I've got the
> corruption at a 'known' place, but no clue as to -how- it's
> happening -- available evidence seems to exclude everything passed
> _into_ getpwnam_r().
>
>
> the offending call is:
>
>     getpwnam_r(cp3, &pw_data, buffer2, sizeof(buffer2), &pwd);
>
> data declaration at the beginning of the function:
>     char buffer[1024];
>     char buffer2[1024];
>     char mailbox[1024];
>     *cp,*cp2,*cp3,*cp4 = buffer;
>     struct passwd pw_data,pw_data2,*pwd=&pw_data2;
>     int i;
>
> The whole program is around 2500 lines of code and headers, with, as
> mentioned, _lots_ of malloc()/free() activity. I can put it up on
> my web-server, if somebody really wants to dig.
>
> I've tried changing the size of buffer2 to 8kb, in case I was
> over-running the 1k buffer. (Unfortunately the manpage does _not_
> specify a minimum size for the buffer.)
> I've tried declaring buffer2 _and_ the 'struct passwd' items as
> 'static', so that _if_ the corruption was coming from one of
> those addresses, the corruption *should* move.
>
> *NONE* of those changes made _any_ difference in where the corruption
> was occurring, or _what_ was being written there.
>
> I'm *really* baffled. HELP!!!
> <*whimper*>
>
> _what_ stuff shows up _does_ differ between the 7.2 and 8.0 systems,
>
> 8.0 reliably produces 116 bytes corrupted:
> [gdb command: x/112 &private_data_pointer->remotehostname
> 0x40a0e070: 103 'g' 114 'r' 111 'o' 117 'u' 112 'p' 0 '\0' 0 '\0' 0 '\0'
> 0x40a0e078: 99 'c' 111 'o' 109 'm' 112 'p' 97 'a' 116 't' 0 '\0' 0 '\0'
> 0x40a0e080: 99 'c' 111 'o' 109 'm' 112 'p' 97 'a' 116 't' 0 '\0' 0 '\0'
> 0x40a0e088: 104 'h' 111 'o' 115 's' 116 't' 115 's' 0 '\0' 0 '\0' 0 '\0'
> 0x40a0e090: 102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0' 0 '\0' 0 '\0'
> 0x40a0e098: 102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0' 0 '\0' 0 '\0'
> 0x40a0e0a0: 102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0' 0 '\0' 0 '\0'
> 0x40a0e0a8: 112 'p' 97 'a' 115 's' 115 's' 119 'w' 100 'd' 0 '\0' 0 '\0'
> 0x40a0e0b0: 99 'c' 111 'o' 109 'm' 112 'p' 97 'a' 116 't' 0 '\0' 0 '\0'
> 0x40a0e0b8: 115 's' 104 'h' 101 'e' 108 'l' 108 'l' 115 's' 0 '\0' 0 '\0'
> 0x40a0e0c0: 102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0' 0 '\0' 0 '\0'
> 0x40a0e0c8: 99 'c' 111 'o' 109 'm' 112 'p' 97 'a' 116 't' 0 '\0' 0 '\0'
> 0x40a0e0d0: 102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0' 0 '\0' 0 '\0'
> 0x40a0e0d8: 102 'f' 105 'i' 108 'l' 101 'e' 115 's' 0 '\0' 0 '\0' 0 '\0'
>
>
> 7.2 produces 64 bytes corrupted:
> [gdb command: x/112 &private_data_pointer->remotehostname
> 0x2820b098: 100 'd' 110 'n' 115 's' 0 '\0' 110 'n' 105 'i' 115 's' 0 '\0'
> 0x2820b0a0: 110 'n' 105 'i' 115 's' 0 '\0' 114 'r' 112 'p' 99 'c' 0 '\0'
> 0x2820b0a8: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
> 0x2820b0b0: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
> 0x2820b0b8: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
> 0x2820b0c0: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 3 '\003' 0 '\0' 8 '\b' 0 '\0'
> 0x2820b0c8: 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
> 0x2820b0d0: 2 '\002' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0' 0 '\0'
>
> This stuff looks like it -might- be from a nsswitch.conf parse.
> I dunno.
>
> anybody got _any_ ideas?
>

This might help. Just above the crashing call, open a file, dump the
contents of the vars you're passing to the function with fprintf, then
close the file. I suspect cp3 doesn't point to valid data and is causing
the function call to fail (it looks like it's pointing to the same thing
as buffer, which doesn't look like how the function should be called).

Having the contents of the individual vars will help you narrow down
exactly what's occurring. The data you want to see is the pointer values
themselves, and maybe the first 8 or so characters of whatever each one
points to.

And if this doesn't help, there's always Google CodeSearch,
http://www.google.com/codesearch , for examples of how to call it.
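
For reference, here is a minimal sketch of a getpwnam_r() call that follows the manual page, with the argument dump suggested above placed just before the call. The helper names dump_args and lookup_uid and the 4096-byte buffer size are assumptions made for illustration; none of them come from the original program. The one detail worth checking against your own code is the last argument: getpwnam_r() stores NULL through it when the user is not found, even though the return value is 0.

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: print each pointer value, plus the first 8
 * characters of the name string, before the call is made. */
static void dump_args(FILE *out, const char *name, const struct passwd *pw,
                      const char *buf, size_t buflen)
{
    fprintf(out, "name=%p \"%.8s\" pw=%p buf=%p buflen=%zu\n",
            (const void *)name, name ? name : "(null)",
            (const void *)pw, (const void *)buf, buflen);
}

/* Hypothetical wrapper: look up `name` and store its uid in *uid_out.
 * Returns 0 on success, -1 if there is no such user, or the error
 * number returned by getpwnam_r() on failure. */
static int lookup_uid(const char *name, uid_t *uid_out)
{
    struct passwd pw_data;         /* storage that getpwnam_r() fills in */
    struct passwd *result = NULL;  /* set to &pw_data, or NULL if not found */
    char buf[4096];                /* assumed size; string fields land here */
    int rc;

    assert(uid_out != NULL);
    if (name == NULL)
        return EINVAL;

    dump_args(stderr, name, &pw_data, buf, sizeof(buf));

    rc = getpwnam_r(name, &pw_data, buf, sizeof(buf), &result);
    if (rc != 0)
        return rc;                 /* e.g. ERANGE if buf was too small */
    if (result == NULL)
        return -1;                 /* call succeeded, but no matching entry */

    *uid_out = result->pw_uid;     /* copy out before buf goes out of scope */
    return 0;
}
```

Calling lookup_uid("root", &uid) should return 0 and leave uid set to 0 on a typical system. The string members of the struct passwd (pw_name, pw_dir and so on) point into buf, so copy anything you need before the buffer goes away.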