From: Jason Evans
To: freebsd-current@freebsd.org
Date: Thu, 19 Jan 2006 23:10:19 -0800
Subject: Typical malloc-related application bugs

Overall, the malloc changeover has been pretty uneventful.
Now that jemalloc has seen a bit wider exposure, I thought it might be useful to summarize the types of application bugs that it has been uncovering.

--------

1) Missing function prototypes for functions that return pointers.

If a function prototype is missing, the compiler provides a default return type of int, but since amd64 is an LP64 architecture, this means that eight-byte pointers get cast to four-byte integers, even if they are then immediately stored in pointer variables. This pointer truncation isn't usually a problem with phkmalloc since it uses sbrk() to allocate space. However, jemalloc does not use sbrk() at all on amd64, so pretty much all malloc'ed objects are high enough in memory that more than 32 bits are required to store pointers to them.

A simple way to determine whether this is a problem for an app is to #define USE_BRK in malloc.c, rebuild libc, then run your app with something like:

    LD_PRELOAD=/usr/obj/usr/src/lib/libc/libc.so.6 startx

Incidentally, this appears to be why xorg-server doesn't work on amd64.

2) Out-of-bounds writes.

Lots of programs have been found to write past the end of the space they allocate. At the moment, jemalloc's redzone code is enabled, so these errors are causing messages to stderr that look like:

    ifconfig: (malloc) Corrupted redzone 1 byte after 0xa000150 (size 18) (0x0)

In at least one case (running f2c while building the math/arpack port), these overruns would have caused actual malloc data structure corruption, had redzones not been enabled.

3) Invalid alignment assumptions.

This class of bug has been less common than the first two, but it is definitely still an issue. The 'Q' and 'k' malloc configuration options provide an effective, if not very efficient, mechanism for digging into alignment issues.

--------

Thanks for all the help so far with diagnosing various bugs, both in apps and in malloc. (Yes, there were some malloc bugs uncovered.)
We should expect to keep hitting pointer truncation bugs on amd64 for some time to come, but otherwise, I think we're pretty much out of the woods at this point. Naturally, if you need help diagnosing problems, I'll continue doing my best to help.

Thanks,
Jason