From: "Intron"
To: Vladimir Kushnir
Cc: freebsd-hackers@freebsd.org
Date: Tue, 15 Aug 2006 16:21:17 +0800
Subject: Re: The optimization of malloc(3): FreeBSD vs GNU libc
References: <20060814231504.GB69362@lor.one-eyed-alien.net> <20060815023505.N1988@kushnir1.kiev.ua>
In-Reply-To: <20060815023505.N1988@kushnir1.kiev.ua>

Vladimir Kushnir wrote:

> Sorry for intrusion.
>
> On Mon, 14 Aug 2006, Brooks Davis wrote:
>
>> On Tue, Aug 15, 2006 at 07:10:47AM +0800, Intron wrote:
>>> One day, a friend told me that his program was 3 times slower under
>>> FreeBSD 6.1 than under GNU/Linux (from Redhat 7.2 to Fedora Core 5).
>>> I was astonished by the real repeatable performance difference on
>>> AMD Athlon XP 2500+ (1.8GHz, 512KB L2 Cache).
>>>
>>> After hacking, I found that the problem is nested in malloc(3) of
>>> FreeBSD libc.
>>>
>>> Download the testing program: http://ftp.intron.ac/tmp/fdtd.tar.bz2
>>>
>>> You may try to compile the program WITHOUT the macro "MY_MALLOC"
>>> defined (in Makefile) to use malloc(3) provided by FreeBSD 6.1.
>>> Then, time the running of the binary (on Athlon XP 2500+):
>>>
>>> #/usr/bin/time ./fdtd.FreeBSD 500 500 1000
>>> ...
>>>       165.24 real       164.19 user         0.02 sys
>>>
>>> Please try to recompile the program (Remember to "make clean")
>>> WITH the macro "MY_MALLOC" defined (in Makefile) to use my own
>>> simple implementation of malloc(3) (i.e. my_malloc() in cal.c).
>>> And time the running again:
>>>
>>> #/usr/bin/time ./fdtd.FreeBSD 500 500 1000
>>> ...
>>>        50.41 real        49.95 user         0.04 sys
>>>
>>> You may repeat this testing again and again.
>>>
>>> I guess this kind of performance difference comes from:
>>>
>>> 1. His program uses malloc(3) to obtain so many small memory blocks.
>>>
>>> 2. In this case, FreeBSD malloc(3) obtains small memory blocks from
>>>    kernel and pass them to application.
>>>
>>>    But malloc(3) of GNU libc obtains large memory blocks from kernel
>>>    and splits & reallocates them in small blocks to application.
>>>
>>>    You may verify my judgement with truss(1).
>>>
>>> 3. The way of FreeBSD malloc(3) makes VM page mapping too chaotic, which
>>>    reduces the efficiency of CPU L2 Cache. In contrast, my my_malloc()
>>>    simulates the behavior of GNU libc malloc(3) partially and avoids
>>>    the over-chaos.
>>>
>>> Callgrind is broken under FreeBSD, or I will verify my guess with it.
>>>
>>> I have also verified the program on Intel Pentium 4 511 (2.8GHz, 1MB
>>> L2 cache, running FreeBSD 6.1 i386 though this CPU supports EM64T)
>>>
>>>> /usr/bin/time ./fdtd.FreeBSD 500 500 1000
>>> ...
>>>       185.30 real       184.28 user         0.02 sys
>>>
>>>> /usr/bin/time ./fdtd.FreeBSD 500 500 1000
>>> ...
>>>        36.31 real        35.94 user         0.03 sys
>>>
>>> NOTE: you probably cannot see the performance difference on CPU with
>>> small L2 cache such as Intel Celeron 1.7GHz with 128 KB L2 Cache.
>>
>> In CURRENT we've replaced phkmalloc with jemalloc. It would be useful
>> to see how this benchmark performs with that. I believe it does similar
>> things.
>>
>> -- Brooks
>>
> On -CURRENT amd64 (Athlon64 3000+, 512k L2 cache):
>
> With jemalloc (without MY_MALLOC):
> ~/fdtd> /usr/bin/time ./fdtd.FreeBSD 500 500 1000
> ...
>       116.34 real       113.69 user         0.00 sys
>
> With MY_MALLOC:
> ~/fdtd> /usr/bin/time ./fdtd.FreeBSD 500 500 1000
> ...
>        45.30 real        44.29 user         0.00 sys
>
> Regards,
> Vladimir

How long has it been since you last CVSup-ed your source tree?
These days the source tree frequently fails to build, so the
7.0-CURRENT binaries on some users' machines are out of date.

------------------------------------------------------------------------
                                                From Beijing, China
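
For readers without the tarball at hand, the strategy described in point 2
of the quoted message (take one large block, then hand out small pieces of
it sequentially) can be sketched roughly as below. This is only an
illustration under stated assumptions and is NOT the my_malloc() from cal.c:
the names bump_alloc, CHUNK_SIZE and ALIGNMENT are hypothetical, the large
block is taken with malloc(3) for portability where a real allocator would
more likely use mmap(2) or sbrk(2), and individual blocks are never freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE ((size_t)1024 * 1024) /* hypothetical: size of each large block */
#define ALIGNMENT  ((size_t)16)          /* hypothetical: rounding unit, power of two */

static char  *chunk_cur  = NULL; /* next free byte in the current large block */
static size_t chunk_left = 0;    /* bytes still unused in the current large block */

/*
 * Carve small blocks sequentially out of one large block, so that
 * consecutive small allocations end up adjacent in memory.
 * Blocks handed out this way cannot be freed individually.
 */
static void *bump_alloc(size_t size)
{
    assert((ALIGNMENT & (ALIGNMENT - 1)) == 0);

    if (size == 0)
        size = 1;
    if (size > SIZE_MAX - (ALIGNMENT - 1))
        return NULL;                       /* rounding up would overflow */
    size = (size + ALIGNMENT - 1) & ~(ALIGNMENT - 1);

    if (size > CHUNK_SIZE)
        return malloc(size);               /* too large to carve; use libc directly */

    if (size > chunk_left) {
        /* Current block exhausted: get a new one (the unused tail is abandoned). */
        char *fresh = malloc(CHUNK_SIZE);
        if (fresh == NULL)
            return NULL;
        chunk_cur  = fresh;
        chunk_left = CHUNK_SIZE;
    }

    void *p = chunk_cur;
    chunk_cur  += size;
    chunk_left -= size;
    return p;
}
```

With this layout two successive 24-byte requests are 32 bytes apart, so the
many small blocks of a program like fdtd sit densely packed rather than
scattered, which is the effect point 3 of the quoted message attributes the
speedup to. Whether that fully explains the timings above is still a guess.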