From owner-freebsd-hackers Tue Jul  4 09:18:27 1995
Return-Path: hackers-owner
Received: (from majordom@localhost) by freefall.cdrom.com (8.6.10/8.6.6) id JAA29293 for hackers-outgoing; Tue, 4 Jul 1995 09:18:27 -0700
Received: from elbe.desy.de (elbe.desy.de [131.169.82.208]) by freefall.cdrom.com (8.6.10/8.6.6) with SMTP id JAA29287 for ; Tue, 4 Jul 1995 09:18:17 -0700
From: Lars Gerhard Kuehl
Date: Tue, 4 Jul 95 18:17:23 +0200
Message-Id: <9507041617.AA03378@elbe.desy.de>
To: Kai.Vorma@hut.fi
Subject: Re: dlmalloc
Cc: FreeBSD-hackers@freefall.cdrom.com
Sender: hackers-owner@FreeBSD.org
Precedence: bulk

> I also use it with XFree86-3.1.1 (X-server and binaries) with very
> good success. Performance is far better than with system malloc
> altough GNU-malloc is sometimes better still.
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Does that mean your X server is faster with gnumalloc() than with the
standard malloc()? I've always encountered just the opposite.

In connection with the X-server memory-leak discussion I had already
spent some time investigating the behaviour of GNU malloc() and the
standard malloc(). For example, I wrote a short program that malloc()s
in the following manner:

(1a: 1 page)(1b: 1 page)(free 1a) (2a: 2 pages)(2b: 1 page)(free 2a) ...

(To force page allocation, the first word of each malloc()ed page is
initialized before free().) After each free(), ps is exec()ed to show
the program's VSZ. The intention was that malloc() cannot reuse the
previously allocated pages, because the virtual addresses of an array
must be contiguous: the i-page hole left by round i is too small for
the (i+1)-page request of the following round.

With the standard malloc() the VSZ therefore grows, as expected, by
about n*(n+1)/2 + n pages in n loops (round i adds about i+1 fresh
pages); with GNU malloc it grows by only about n pages. But there is
another remarkable difference:

standard malloc():

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 69.4       2.78     2.78        1  2780.27  3750.00  _main [2]
                                    ^^^^^^^  ^^^^^^^
 15.2       3.39     0.61     1001     0.61     0.61  _fork [3]
  6.2       3.64     0.25                             mcount (31)
  2.8       3.75     0.11     2001     0.06     0.06  _malloc [5]
  1.4       3.81     0.06     1001     0.06     0.06  _wait4 [9]
  1.4       3.86     0.05     2001     0.03     0.06  _vfprintf [6]
  1.3       3.92     0.05     1001     0.05     0.18  _execps [4]
  0.5       3.94     0.02     1000     0.02     0.02  _write [14]
  0.3       3.95     0.01     2001     0.01     0.02  ___sfvwrite [11]
  0.2       3.96     0.01     1001     0.01     0.01  _wait [15]
  0.1       3.96     0.01     1001     0.01     0.01  _getpid [16]
--- cut ---

gnu malloc():

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 91.5      12.54    12.54        1 12541.02 13208.15  _main [2]
                                   ^^^^^^^^ ^^^^^^^^
  2.5      12.88     0.34     1001     0.34     0.34  _fork [3]
  1.9      13.14     0.26                             mcount (31)
  1.7      13.37     0.23     2992     0.08     0.08  _sbrk [5]
  0.4      13.43     0.06     2001     0.03     0.06  _vfprintf [7]
  0.4      13.49     0.06     1001     0.06     0.26  _execps [4]
  0.3      13.53     0.05     1001     0.05     0.05  _wait4 [13]
  0.3      13.58     0.05                             _malloc [12]
  0.2      13.61     0.03     1000     0.03     0.03  _write [16]
  0.1      13.63     0.02     2001     0.01     0.03  ___sfvwrite [11]
  0.1      13.65     0.02                             __free_internal [6]
--- cut ---

Here n was 1000, but gnumalloc() is much slower not only for this
value. So far I have neither checked this with dlmalloc() nor compiled
a profiled kernel (it is very likely a general VM problem independent
of the particular malloc implementation; the time is obviously spent
in page allocation). I probably won't get to that before next weekend.

Lars
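
P.S.: Roughly, the test looks like the sketch below. It is only a
minimal reconstruction from the description above, not the exact code:
the helpers execps() and touch(), the 4096-byte page size and the ps
flags are merely illustrative.

--- cut ---
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define PAGESIZE 4096		/* assumed i386 page size; getpagesize() is safer */
#define N	 1000		/* number of loops (n) */

/* fork and exec ps to print the test program's VSZ;
 * the exact ps flags are only a guess */
static void
execps(pid_t pid)
{
	char arg[16];

	if (fork() == 0) {
		sprintf(arg, "%d", (int)pid);
		execlp("ps", "ps", "-o", "vsz=", "-p", arg, (char *)NULL);
		_exit(1);
	}
	wait(NULL);
}

/* touch the first word of every page so it is really allocated */
static void
touch(char *p, int pages)
{
	int i;

	for (i = 0; i < pages; i++)
		*(long *)(p + i * PAGESIZE) = 1;
}

int
main(void)
{
	char *a, *b;
	int i;

	for (i = 1; i <= N; i++) {
		a = malloc((size_t)i * PAGESIZE);	/* ia: i pages */
		b = malloc(PAGESIZE);			/* ib: 1 page, kept */
		touch(a, i);
		touch(b, 1);
		free(a);		/* leaves an i-page hole that the	*/
		execps(getpid());	/* next, (i+1)-page request can't reuse	*/
	}
	return (0);
}
--- cut ---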