Date: Thu, 28 Nov 2002 09:55:22 +0100
From: Daniel Lang
To: Poul-Henning Kamp
Cc: freebsd-hackers@FreeBSD.ORG, chopin@sgh.waw.pl
Subject: Re: strange coredump in malloc_bytes()/libc in 4.7p2
Message-ID: <20021128085522.GA64864@atrbg11.informatik.tu-muenchen.de>
References: <20021126131438.GC60278@atrbg11.informatik.tu-muenchen.de> <99290.1038340689@critter.freebsd.dk>
In-Reply-To: <99290.1038340689@critter.freebsd.dk>

Hi Poul-Henning,

Poul-Henning Kamp wrote on Tue, Nov 26, 2002 at 08:58:09PM +0100:
[..]
> I think we can more or less conclude that something has trashed your
> memory.
>
> I'd suggest you try to run your program with ElectricFence or similar.
[..]

I found the problem. It seems to be the result of a series of
unfortunate events. Although part of the blame lies with the
application (ircd), I believe I have found a possible problem with
malloc() as well. I will explain in detail.

First I installed EFence and linked ircd against it, only to find that
ircd with -lefence died under the debugger with
"Program exited with 0377" (or similar). I would have expected EF to
SEGV if a memory barrier was crossed...

Single-stepping through the code inside EFence's allocator, I finally
came across an mmap() call that returned 0xffffffff, which, as it
seems, is MAP_FAILED. For some unknown reason the corresponding error
message was not printed, but this put me on the right track.

Going up the stack frames, it became clear that the program had tried
to allocate a ridiculous amount of memory, which finally led to this
error. The amount of memory to be allocated came from a broken tuning
file that is read on startup. *sigh*
So here ircd is to blame for not sanity-checking the tune file. I will
discuss this with the developers separately.

But since mmap() returned a failure, I was curious why malloc() did
not cause a similar error when ircd was run without EFence.
First I checked that the call to malloc() returns 0x0, as it should.
This was the case, and this case is also handled in ircd's code.
The process is not aborted, though. When I continued in the debugger,
the process soon died with the strange error in isatty().

Again I dug up a libc with symbols and stepped through malloc() as
well. Here is what I found:

malloc() is called with an argument of -149139900, which is
4145827396 bytes when interpreted as an unsigned value. This value is
rounded up by pageround() and shifted, resulting in 1012165 pages.
This value is passed to map_pages():

[..]
static void *
map_pages(size_t pages)
{
    caddr_t result, tail;

    result = (caddr_t)pageround((u_long)sbrk(0));
    tail = result + (pages << malloc_pageshift);

    if (brk(tail)) {
#ifdef EXTRA_SANITY
	wrterror("(ES): map_pages fails\n");
#endif /* EXTRA_SANITY */
	return 0;
    }
[..]

Passing such a value to map_pages() seems to result in an overflow in
the calculation of "tail":

(gdb) p result
$18 = 0x8f22000
(gdb) p tail
$17 = 0xe7000

As I understand it, result is the current 'break', the upper border of
the process' data segment, and tail should be the future upper border,
increased by the requested number of pages. brk() just sets the new
value, but in this case of overflow it does not increase the break, it
_lowers_ it to some utterly wrong value. The call to brk() succeeds!

This seems to me to be _the_ place where the memory corruption actually
happens. A sanity check for the overflow would not hurt, so I'll attach
a patch. However, I do not know whether this can be considered a bug.

The check after the call to brk():

    if ((last_index+1) >= malloc_ninfo && !extend_pgdir(last_index))

fails, and malloc() returns 0x0 after all.

Well, I'm not sure whether programs are expected to exit immediately
after a malloc() fails, but I think they are not necessarily.

Finally I included malloc_options="X" in the code, and, yes, the
program exited with abort() at a much more sensible location. I did
not remember the malloc options until today, alas. :-/

OK, so far from me. Any comments about my discovery and patch are
appreciated.

Best regards,
 Daniel
-- 
IRCnet: Mr-Spock - All your .sigs are belong to us -
Daniel Lang * dl@leo.org * +49 89 289 18532 * http://www.leo.org/~dl/

--- src/lib/libc/stdlib/malloc.c.orig	Thu Nov 28 09:51:09 2002
+++ src/lib/libc/stdlib/malloc.c	Thu Nov 28 09:53:00 2002
@@ -307,6 +307,14 @@
     result = (caddr_t)pageround((u_long)sbrk(0));
     tail = result + (pages << malloc_pageshift);
 
+    /* check for overflow */
+    if(tail < result) {
+#ifdef EXTRA_SANITY
+	wrterror("(ES): overflow in map_pages; failed\n");
+#endif /* EXTRA_SANITY */
+	return 0;
+    }
+
     if (brk(tail)) {
 #ifdef EXTRA_SANITY
 	wrterror("(ES): map_pages fails\n");