Date: Thu, 16 Mar 2006 17:46:50 -0800
From: Jason Evans <jasone@FreeBSD.org>
To: bert hubert <bert.hubert@netherlabs.nl>
Cc: freebsd-hackers@freebsd.org
Subject: Re: unsatisfying c++/boost::multi_index_container::erase performance on at least FreeBSD 6.0
Message-ID: <441A150A.6000003@FreeBSD.org>
In-Reply-To: <20060316224912.GA14905@outpost.ds9a.nl>
References: <20060316224912.GA14905@outpost.ds9a.nl>
bert hubert wrote:
> Dear FreeBSD hackers,
>
> I'm working on improving the PowerDNS recursor for a big FreeBSD-loving
> internet provider in The Netherlands and I am hitting some snags. I also
> hope this is the appropriate list to share my concerns.
>
> Pruning the cache is very very slow on the providers FreeBSD 6.0 x86 systems
> whereas it flies on other operating systems.
>
> I've managed to boil down the problem to the code found on
> http://ds9a.nl/tmp/cache-test.cc which can be compiled with:
> 'g++ -O3 -I/usr/local/include cache-test.cc -o cache-test' after installing
> Boost from the ports.
>
> The problem exists both with the system compiler and with a self-compiled
> g++ 4.1.
>
> Here are some typical timings:
> $ ./cache-test
> Creating..
> Copying 499950 nodes
> 100     345 usec        3.45 usec/erase
> 300     3298 usec       10.99 usec/erase
> 500     8749 usec       17.50 usec/erase
> 700     72702 usec      103.86 usec/erase
> 900     46521 usec      51.69 usec/erase
>
> On another operating system with almost the same cpu:
>
> $ ./cache-test
> Creating..
> Copying 499950 nodes
> 100     62 usec         0.62 usec/erase
> 300     187 usec        0.62 usec/erase
> 500     347 usec        0.69 usec/erase
> 700     419 usec        0.60 usec/erase
> 900     575 usec        0.64 usec/erase
>
> I've toyed with MALLOC_OPTIONS, especially the >> options, I've tried
> GLIBCXX_FORCE_NEW, I've tried specifying a different STL allocator in the
> c++ code, it all doesn't change a thing.
>
> A quick gprof profile shows a tremendous number of calls to 'ifree' but that
> may be due to the copying of the container going on between test runs.
>
> Any help would be very appreciated as I am all out of clues.
>
> Thanks.

I ran cache-test on -current using phkmalloc and a couple of different
versions of jemalloc. jemalloc does not appear to have the same issue for
this test. It isn't obvious to me why phkmalloc is performing so poorly,
but I think you can assume that this is a malloc performance problem.

The following jemalloc results were run with NO_MALLOC_EXTRAS defined. I
included the patch results because I expect to commit the patch this week.
phkmalloc and -current jemalloc have similar memory usage (maximum resident
set size of 279240 versus 275752), but jemalloc is substantially faster
(38.58 versus 495.53 seconds of real time). The jemalloc patch uses
substantially less memory than either phkmalloc or -current jemalloc
(maximum resident set size of 222976).

Jason

------- phkmalloc: -----------------------------------------------------
onyx:~> MALLOC_OPTIONS=aj LD_PRELOAD=/tmp/phkmalloc/libc/libc.so.6 =time -l ./cache-test
Creating..
Copying 499950 nodes
100     501 usec        5.01 usec/erase
300     53183 usec      177.28 usec/erase
500     5491 usec       10.98 usec/erase
700     158989 usec     227.13 usec/erase
900     47491 usec      52.77 usec/erase
1100    324948 usec     295.41 usec/erase
1300    106480 usec     81.91 usec/erase
1500    522414 usec     348.28 usec/erase
1700    155604 usec     91.53 usec/erase
1900    685235 usec     360.65 usec/erase
2100    230939 usec     109.97 usec/erase
2300    860083 usec     373.95 usec/erase
2500    234910 usec     93.96 usec/erase
2700    1226310 usec    454.19 usec/erase
2900    205739 usec     70.94 usec/erase
3100    1379395 usec    444.97 usec/erase
3300    296925 usec     89.98 usec/erase
3500    1620705 usec    463.06 usec/erase
3700    312343 usec     84.42 usec/erase
3900    1835125 usec    470.54 usec/erase
4100    306443 usec     74.74 usec/erase
4300    1805999 usec    420.00 usec/erase
4500    323179 usec     71.82 usec/erase
4700    1593007 usec    338.94 usec/erase
4900    316249 usec     64.54 usec/erase
      495.53 real       494.29 user         1.17 sys
    279240  maximum resident set size
        60  average shared memory size
    274524  average unshared data size
       128  average unshared stack size
     78238  page reclaims
         1  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         4  voluntary context switches
      6492  involuntary context switches

------- jemalloc (-current): -------------------------------------------
onyx:~> MALLOC_OPTIONS=aj LD_PRELOAD=/tmp/jemalloc/libc/libc.so.6 =time -l ./cache-test
Creating..
Copying 499950 nodes
100     281 usec        2.81 usec/erase
300     586 usec        1.95 usec/erase
500     1008 usec       2.02 usec/erase
700     973 usec        1.39 usec/erase
900     1489 usec       1.65 usec/erase
1100    2269 usec       2.06 usec/erase
1300    2493 usec       1.92 usec/erase
1500    3337 usec       2.22 usec/erase
1700    3815 usec       2.24 usec/erase
1900    3511 usec       1.85 usec/erase
2100    4493 usec       2.14 usec/erase
2300    4235 usec       1.84 usec/erase
2500    6043 usec       2.42 usec/erase
2700    5474 usec       2.03 usec/erase
2900    7670 usec       2.64 usec/erase
3100    6104 usec       1.97 usec/erase
3300    10923 usec      3.31 usec/erase
3500    4560 usec       1.30 usec/erase
3700    9998 usec       2.70 usec/erase
3900    8023 usec       2.06 usec/erase
4100    15031 usec      3.67 usec/erase
4300    5588 usec       1.30 usec/erase
4500    15490 usec      3.44 usec/erase
4700    6544 usec       1.39 usec/erase
4900    14565 usec      2.97 usec/erase
       38.58 real        37.98 user         0.57 sys
    275752  maximum resident set size
        60  average shared memory size
        12  average unshared data size
       128  average unshared stack size
     68494  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         1  voluntary context switches
      1180  involuntary context switches

------- jemalloc (patch): ----------------------------------------------
(http://people.freebsd.org/~jasone/jemalloc/patches/jemalloc_20060315a.diff)
onyx:~> MALLOC_OPTIONS=aj LD_PRELOAD=/usr/obj/usr/src/lib/libc/libc.so.6 =time -l ./cache-test
Creating..
Copying 499950 nodes
100     232 usec        2.32 usec/erase
300     912 usec        3.04 usec/erase
500     2514 usec       5.03 usec/erase
700     2008 usec       2.87 usec/erase
900     3255 usec       3.62 usec/erase
1100    2931 usec       2.66 usec/erase
1300    4010 usec       3.08 usec/erase
1500    3486 usec       2.32 usec/erase
1700    4675 usec       2.75 usec/erase
1900    2992 usec       1.57 usec/erase
2100    2417 usec       1.15 usec/erase
2300    4986 usec       2.17 usec/erase
2500    4000 usec       1.60 usec/erase
2700    5990 usec       2.22 usec/erase
2900    3661 usec       1.26 usec/erase
3100    4702 usec       1.52 usec/erase
3300    5934 usec       1.80 usec/erase
3500    7999 usec       2.29 usec/erase
3700    5998 usec       1.62 usec/erase
3900    6489 usec       1.66 usec/erase
4100    6997 usec       1.71 usec/erase
4300    7965 usec       1.85 usec/erase
4500    7849 usec       1.74 usec/erase
4700    8456 usec       1.80 usec/erase
4900    7814 usec       1.59 usec/erase
       37.13 real        35.86 user         1.22 sys
    222976  maximum resident set size
        59  average shared memory size
        11  average unshared data size
       127  average unshared stack size
    104136  page reclaims
         0  page faults
         0  swaps
         0  block input operations
         0  block output operations
         0  messages sent
         0  messages received
         0  signals received
         2  voluntary context switches
      1162  involuntary context switches
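
For anyone who cannot fetch cache-test.cc, the shape of the measurement in
the tables above can be approximated with a short sketch. This is not the
original test program: it assumes a std::multimap as a stand-in for the
Boost multi_index_container, and the names Cache, make_cache,
time_erase_usec and run_benchmark are hypothetical. It copies the
container, erases a batch of the lowest-keyed entries, and prints elapsed
microseconds and microseconds per erase (elapsed time divided by batch
size, which matches the usec/erase column in the output).

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <map>
#include <string>

// Hypothetical stand-in for the cache; the real test uses a
// boost::multi_index_container, which is not reproduced here.
using Cache = std::multimap<unsigned, std::string>;

// Build a cache with n heap-allocated nodes keyed 0..n-1.
Cache make_cache(std::size_t n) {
    Cache c;
    for (std::size_t i = 0; i < n; ++i)
        c.emplace(static_cast<unsigned>(i), "record-" + std::to_string(i));
    return c;
}

// Erase up to `count` entries from the front of the container and
// return the elapsed wall-clock time in microseconds.
double time_erase_usec(Cache& c, std::size_t count) {
    const auto start = std::chrono::steady_clock::now();
    auto it = c.begin();
    for (std::size_t i = 0; i < count && it != c.end(); ++i)
        it = c.erase(it);  // each erase frees one node via the allocator
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}

// Time batches of 100, 300, 500, ... erases, each on a fresh copy.
void run_benchmark(std::size_t nodes, std::size_t max_batch) {
    const Cache original = make_cache(nodes);
    for (std::size_t batch = 100; batch <= max_batch; batch += 200) {
        Cache copy = original;  // copying reallocates every node
        const double usec = time_erase_usec(copy, batch);
        // Sanity check: exactly min(batch, nodes) entries were removed.
        assert(copy.size() ==
               original.size() - std::min(batch, original.size()));
        std::printf("%zu\t%.0f usec\t%.2f usec/erase\n",
                    batch, usec, usec / static_cast<double>(batch));
    }
}
```

Calling run_benchmark(499950, 4900) from main() would produce rows in the
same layout as above. Absolute numbers will differ from cache-test, so the
sketch is only useful for comparing allocators against each other on the
same machine.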