From owner-freebsd-performance@FreeBSD.ORG Tue Apr 4 16:43:51 2006
From: Sven Petai
To: freebsd-performance@freebsd.org
Date: Tue, 4 Apr 2006 19:42:17 +0300
Message-Id: <200604041942.18767.hadara@bsd.ee>
Subject: mysql performance on 4 * dualcore opteron

hi

Before I begin, let me just say that I'm probably aware of most of the threads
about mysql performance in various fbsd lists over the last couple of years, so
please let's not concentrate on the usual points made over and over again,
like how filesystems are mounted under linux, how fast time() is or how
various combinations of scheduler/threading library/compiler flags give you
~5-10% better performance. It's very unlikely that any of these reasons, or
even all of them together, can explain performance differences of 2-3 *

so now a little bit of the background...
I usually use a MySQL benchmark called super-smack as one of the benchmarks on
all the new machines to get a general feeling of the server's performance.
I certainly agree that the default smack workloads are far too simple to say
much about actual production performance, but still... better than nothing...

In general a 2.4GHz amd64 UP box (6.1 betaX) can do about
17400 q/s with select-smack+4bsd+thr combination and
4300 q/s with update-smack+4bsd+thr

on a dualcore 2GHz opteron (6.1 prerelease) the results are:
20000 q/s with select-smack+4bsd+thr and
4500 q/s with update-smack+4bsd+thr

performance for update-smack seems to be always 4XXX q/s, no matter how many
CPUs the box has or what kind of raid controller/disks are used (I have
tested on about 8 rather different machines). I have no idea if IO on all
the servers I have tried really maxes out at this point or if there is some
bottleneck in UFS.
select-smack performance gains on dualcore are not quite as good as one might
expect, but then again that dualcore box uses ECC memory which is probably
somewhat slower because of the checksum calculations, and synchronisation has
some overhead too...
Anyway, all in all I'm more or less happy with these results, even though linux
will do about twice as many selects on the same hardware.

Today I had a chance to test a 4 * 2GHz dualcore opteron machine, so this
machine has 8 cores in total and 8G of RAM.

Now, on that server I get:
11000 q/s for select-smack+4bsd+thr combination (with KSE it's around 6000
q/s, ule+thr gives somewhere around 12000 q/s)
4100 q/s for update-smack+4bsd+thr

So the 8 core machine got an almost 2* worse result for select than the UP server.

After some tinkering I found out that renicing mysqld to -5 will make it push
out 21000 q/s (4bsd, thr), so I suspect part of the problem is in the
scheduling - probably super-smack with its 100 processes just gets a lot
more CPU time otherwise than mysql with its 100 threads servicing them.
But anyway, even this result is still only about equal in performance to what I
get from the dualcore machine.

As I ran out of good (macro)tuning ideas at this point, and wanted to make
sure higher scores are indeed achievable, I tried Linux on the same hardware.
Here are the results for the same tests on Suse enterprise linux 9
(2.6.5-7.97-smp):
76857 q/s for select-smack
10050 q/s for update-smack

the mysql configuration was identical to the one I used under freebsd
(my-huge).
This Suse uses ReiserFS, but I have no idea what kind of FS guarantees
it provides; I didn't see any sync/async stuff in the mount output.
I also repeated the tests on an identical box that had Fedora installed
(2.6.9-22-ELsmp) and used ext3.
select-smack results were obviously almost the same as it doesn't touch the
FS, update was about 8000 q/s.

I'm relatively sure that this kind of huge performance difference can't be
explained by the mere speed difference of time(). I haven't yet tested phk's and
robert's timer hacks, but at some point in time I rewrote mysql's timing code
to completely avoid any calls to time() by keeping an internal timestamp that
was updated from the TSC reg. value. It was certainly very ugly and imprecise,
but worked well enough since mysql uses these code paths mainly for
statistics and for setting various safeguard timeouts.
Even with ~90% of time() calls removed the performance still didn't get
measurably better.
Of course it's possible that I fucked up somehow, so if someone has tested
robert's and phk's changes then it would certainly be nice to hear about your
results.

To make the long story short - does anyone have any good ideas about where
the bottleneck might be and how to debug it?

PS
Here's some system/test information:
super-smack was used with concurrency of 100 and reqs. set to 10000
it was running on the same machine as the mysqld and connections were done
over local socket.

timer: acpi-fast in all the cases
mysql: 4.1.18_2 from ports, table type is myisam
mysql configuration file:
http://bsd.ee/~hadara/debug/mysql3/2way/my.cnf
in general it's just my-huge.cnf from mysql distribution, with increased
max_connections

kernel config is GENERIC-SMP (no it doesn't have WITNESS enabled)
== 4 * dualcore opteron ==:
vmstat 1, during select-smack test:
http://bsd.ee/~hadara/debug/mysql3/8way/vmstat.txt
dmesg:
http://bsd.ee/~hadara/debug/mysql3/8way/dmesg.boot
sysctl -a:
http://bsd.ee/~hadara/debug/mysql3/8way/sysctl.txt

== 1 * dualcore opteron ==:
vmstat 1, during select-smack test:
http://bsd.ee/~hadara/debug/mysql3/2way/vmstat.txt
dmesg:
http://bsd.ee/~hadara/debug/mysql3/2way/dmesg.boot
sysctl -a:
http://bsd.ee/~hadara/debug/mysql3/2way/sysctl.txt

From owner-freebsd-performance@FreeBSD.ORG Wed Apr 5 05:31:39 2006
From: David Xu
Organization: netease.com
To: freebsd-performance@freebsd.org
Date: Wed, 5 Apr 2006 13:31:23 +0800
References: <200604041942.18767.hadara@bsd.ee>
In-Reply-To: <200604041942.18767.hadara@bsd.ee>
Message-Id: <200604051331.23826.yfxu@corp.netease.com>
Subject: Re: mysql performance on 4 * dualcore opteron

On Wednesday 05 April 2006 00:42, Sven Petai wrote:
>
> hi
>
> Before I begin, let me just say that I'm probably aware of most of the threads
> about mysql performance in various fbsd lists over the last couple of years, so
> please let's not concentrate on the usual points made over and over again,
> like how filesystems are mounted under linux, how fast time() is or how
> various combinations of scheduler/threading library/compiler flags give you
> ~5-10% better performance. It's very unlikely that any of these reasons, or
> even all of them together, can explain performance differences of 2-3 *
>

Can you disable the log-bin option in my.cnf to see if it is a FS bottleneck
when you are running update-smack? Please run Linux and FreeBSD
with the same hardware and my.cnf configuration, thanks.
I know this is not quite right, but it can be used to narrow down some
kernel performance problem.

Regards,
David Xu

From owner-freebsd-performance@FreeBSD.ORG Wed Apr 5 14:26:07 2006
From: Sven Petai
To: freebsd-performance@freebsd.org
Date: Wed, 5 Apr 2006 17:30:51 +0300
References: <200604041942.18767.hadara@bsd.ee> <200604051331.23826.yfxu@corp.netease.com>
In-Reply-To: <200604051331.23826.yfxu@corp.netease.com>
Message-Id: <200604051730.52014.hadara@bsd.ee>
Subject: Re: mysql performance on 4 * dualcore opteron

On Wednesday 05 April 2006 08:31, David Xu wrote:
>
> Can you disable the log-bin option in my.cnf to see if it is a FS bottleneck
> when you are running update-smack? Please run Linux and FreeBSD
> with the same hardware and my.cnf configuration, thanks.
> I know this is not quite right, but it can be used to narrow down some
> kernel performance problem.

I can't test disabling the log-bin option right now since those 8 core systems
were shipped out to the client today, but I will probably get access to some
identical servers next week, so I will then test this and whatever other
suggestions you people can come up with.

Until then, can we maybe concentrate on
* why does the 8 core machine get such an awful select score without renicing mysqld
* why is the select result on linux >65000 q/s while fbsd can do only about 21000 q/s

But the benchmark results that I stated for the 8 core machines were done on
4 _identical_ servers, with the following operating system installations:
server 1 - suse enterprise linux 9 with kernel 2.6.5-7.97-smp, mysql 4.0.18-32.1, reiserfs
server 2 - Fedora core with kernel 2.6.9-22-ELsmp, mysql 4.1.12, ext3
server 3 & server 4 - freebsd 6.1 beta 4, generic-smp kernel, mysql 4.1.12_2 from ports, ufs2 + softupdates, libthr

While it was not the very _same_ hardware, the machines were absolutely
identical in every aspect and the fbsd results on 2 servers were identical
within the limits of measurement error, so it's rather unlikely that
some hardware glitch can be blamed for the differences.
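
To make the size of the gap behind those two questions concrete, here is a small sketch of the arithmetic using only q/s figures already quoted in this thread for the 8 core boxes. Python is used purely for illustration, and the dictionary keys and the helper name `gap` are labels made up for this example, not anything super-smack or mysql prints.

```python
# q/s figures for the 8 core machines as quoted earlier in this thread.
# The key names are illustrative labels only.
select_qps = {
    "fbsd_default": 11000,   # 4bsd + thr, mysqld not reniced
    "fbsd_reniced": 21000,   # 4bsd + thr, mysqld reniced to -5
    "suse": 76857,
}
update_qps = {
    "fbsd_default": 4100,
    "suse": 10050,
}

def gap(results, fast, slow):
    """Return how many times faster `fast` is than `slow`."""
    return results[fast] / results[slow]

select_gap_default = gap(select_qps, "suse", "fbsd_default")
select_gap_reniced = gap(select_qps, "suse", "fbsd_reniced")
update_gap = gap(update_qps, "suse", "fbsd_default")

print(f"select, no renice: {select_gap_default:.2f}x")
print(f"select, reniced:   {select_gap_reniced:.2f}x")
print(f"update:            {update_gap:.2f}x")
```

Even with the renice applied, the select gap stays well above the 5-10% that scheduler, threading library or compiler flag changes are usually credited with.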

The mysql configuration file was also identical - default my-huge.cnf with
the max_connections change @
http://bsd.ee/~hadara/debug/mysql3/2way/my.cnf

Hardware spec of those servers was:
motherboard: Thunder K8QSD Pro
hdd: scsi seagate cheetah 10K7
ram: 8 * 3200 CL3 kingston ECC 1G
cpu: 4 * opteron 870 (2Ghz dualcore)

Results for the other UP and dual machines that I mentioned were given just to
give some general feeling about scalability.

So just to state the 8 core results (q/s) in a more concise manner:

smack     suse     fedora   fbsd 6.1b4
-----------------------------------------------------
select    76857    67000    21000
update    10047     8072     4100

From owner-freebsd-performance@FreeBSD.ORG Wed Apr 5 17:04:16 2006
Message-ID: <021b01c658d2$de254a00$b3db87d4@multiplay.co.uk>
From: "Steven Hartland"
To: "Sven Petai"
References: <200604041942.18767.hadara@bsd.ee>
Date: Wed, 5 Apr 2006 18:03:25 +0100
Subject: Re: mysql performance on 4 * dualcore opteron

Looking at this on a dual box here ( waiting for the new MB for dual dual core )
All the time is spent processing super-smack and only 25% on mysqld.
Even dropping to 10 clients a large portion is taken by the clients.

That said, there is a lot that can be gained by using the tweaks out there, i.e.
ULE + libthr + TSC + context_time.patch + cpu_acct_1.patch + cpu_acct_2.patch

Adding these jumps from a baseline:
select_index 2000000 8 0 18624.60
to:
select_index 2000000 5 0 29942.10

The biggest increases come from libthr ( thanks DavidXu ) and the ULE
scheduler.

[log]
== 4BSD + libpthread + ACPI-Fast ==
super-smack -d mysql select-key.smack 100 10000
Query Barrel Report for client smacker1
connect: max=46ms min=6ms avg= 25ms from 100 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         8               0               18624.60

super-smack -d mysql select-key.smack 10 100000
Query Barrel Report for client smacker1
connect: max=5ms min=0ms avg= 1ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         0               0               23983.87

== 4BSD + libthr + ACPI-Fast ==
super-smack -d mysql select-key.smack 100 10000
Query Barrel Report for client smacker1
connect: max=107ms min=2ms avg= 45ms from 100 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         13              0               22413.39

super-smack -d mysql select-key.smack 10 100000
Query Barrel Report for client smacker1
connect: max=2ms min=1ms avg= 1ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         0               0               26841.07

== 4BSD + libthr + TSC ==
super-smack -d mysql select-key.smack 100 10000
Query Barrel Report for client smacker1
connect: max=46ms min=1ms avg= 21ms from 100 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         11              0               23428.03

super-smack -d mysql select-key.smack 10 100000
Query Barrel Report for client smacker1
connect: max=2ms min=0ms avg= 1ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         0               0               26403.95

== ULE + libthr + TSC ==
super-smack -d mysql select-key.smack 100 10000
Query Barrel Report for client smacker1
connect: max=41ms min=0ms avg= 23ms from 100 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         5               0               28581.18

super-smack -d mysql select-key.smack 10 100000
Query Barrel Report for client smacker1
connect: max=4ms min=0ms avg= 1ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         0               0               30128.44

== ULE + libthr + TSC + context_time.patch + cpu_acct_1.patch + cpu_acct_2.patch ==
super-smack -d mysql select-key.smack 100 10000
Query Barrel Report for client smacker1
connect: max=27ms min=0ms avg= 14ms from 100 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         5               0               29942.10

super-smack -d mysql select-key.smack 10 100000
Query Barrel Report for client smacker1
connect: max=12ms min=0ms avg= 4ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         0               0               31057.52

== 4BSD + libthr + TSC + context_time.patch + cpu_acct_1.patch + cpu_acct_2.patch ==
super-smack -d mysql select-key.smack 100 10000
Query Barrel Report for client smacker1
connect: max=54ms min=20ms avg= 38ms from 100 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         9               0               24144.22

super-smack -d mysql select-key.smack 10 100000
Query Barrel Report for client smacker1
connect: max=2ms min=0ms avg= 1ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    2000000         0               0               27073.46

** update test **
super-smack -d mysql update-select.smack 10 100000
Query Barrel Report for client smacker
connect: max=3ms min=0ms avg= 0ms from 10 clients
Query_type      num_queries     max_time        min_time        q_per_s
select_index    1000000         1               0               6468.70
update_index    1000000         0               0               6468.70
[/log]

Machine:
Dual 244, 2Gb running FreeBSD 6.1-PRERELEASE (i386)
Package install of mysql 4.0
Port install of super-smack

Notes:
No detectable disk activity throughout the tests.
The ULE scheduler breaks the output from top: everything shows as WCPU 0% in
the 100 concurrency test, and in the 10 concurrency test the numbers either
do not add up at all or show 0%.
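
As a reading aid for the log above, the following sketch expresses each 100-client select run as a percentage gain over the first (4BSD + libpthread + ACPI-Fast) run. The q_per_s values are copied from the log; the shortened labels ("patches" stands for context_time.patch + cpu_acct_1.patch + cpu_acct_2.patch) and the helper name `gains_over_baseline` are made up for this example.

```python
# q_per_s for "select-key.smack 100 10000", copied from the log above in run order.
runs = [
    ("4BSD + libpthread + ACPI-Fast", 18624.60),
    ("4BSD + libthr + ACPI-Fast", 22413.39),
    ("4BSD + libthr + TSC", 23428.03),
    ("ULE + libthr + TSC", 28581.18),
    ("ULE + libthr + TSC + patches", 29942.10),
    ("4BSD + libthr + TSC + patches", 24144.22),
]

def gains_over_baseline(results):
    """Return (label, percent gain) pairs relative to the first entry."""
    base = results[0][1]
    return [(label, 100.0 * (qps - base) / base) for label, qps in results]

gains = gains_over_baseline(runs)
for label, pct in gains:
    print(f"{label:32s} {pct:+6.2f}%")
```

On these numbers the fully patched ULE run comes out roughly 60% above the baseline, while the same patches on 4BSD land nearer 30%.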

To get context_time.patch to work I needed the attached patch which is
basically two failed chunks of:
kern/kern_exit.c moved to kern/kern_thread.c

Steve

Attachment: kern_thread.c.patch

--- kern/kern_thread.c.orig	Wed Apr  5 16:06:00 2006
+++ kern/kern_thread.c	Wed Apr  5 16:09:00 2006
@@ -441,5 +441,5 @@
 thread_exit(void)
 {
-	struct bintime new_switchtime;
+	uint64_t new_switchtime;
 	struct thread *td;
 	struct proc *p;
@@ -481,7 +481,6 @@
 
 	/* Do the same timestamp bookkeeping that mi_switch() would do. */
-	binuptime(&new_switchtime);
-	bintime_add(&p->p_rux.rux_runtime, &new_switchtime);
-	bintime_sub(&p->p_rux.rux_runtime, PCPU_PTR(switchtime));
+	new_switchtime = cpu_ticks();
+	p->p_rux.rux_runtime += (new_switchtime - PCPU_GET(switchtime));
 	PCPU_SET(switchtime, new_switchtime);
 	PCPU_SET(switchticks, ticks);

From owner-freebsd-performance@FreeBSD.ORG Wed Apr 5 18:52:30 2006
Date: Wed, 5 Apr 2006 13:52:26 -0500 (CDT)
From: Mike Silbersack
To: Steven Hartland
Message-ID: <20060405134919.T16926@odysseus.silby.com>
Cc: freebsd-performance@freebsd.org
Subject: mysql tests - one more thing to try

I'm not subscribed to -performance, which is why this isn't a true reply...

I noticed that Steve said:

---
Looking at this on a dual box here ( waiting for the new MB for dual dual core )
All the time is spent processing super-smack and only 25% on mysqld.
---

If you're willing to spend more time looking at this, I suggest that you
run truss or ktrace on the super-smack processes. I did a small amount of
mysql vs postgres vs firebird benchmarking two years ago for a class
project, and noticed that mysql's results were showing the same phenomenon
- our test program was using more cpu than mysqld. I ran truss on our
test program and found that it was doing ONE BYTE READS from the socket,
rather than something larger.

I never had the time to see if the problem was fixed at a later time or
not. You may wish to see if that same condition is still happening.
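
To illustrate why one-byte reads matter, here is a minimal sketch that counts how many recv() calls it takes to drain the same payload from a local socket pair with a 1-byte request size versus a larger one. It does not use the MySQL client library or super-smack at all; the function names `drain` and `count_recv_calls`, the 1024-byte payload and the 4096-byte buffer are assumptions chosen for the example.

```python
import socket

def drain(sock, total, chunk):
    """Read `total` bytes from `sock`, asking for at most `chunk` bytes per
    recv() call, and return the number of recv() calls that were needed."""
    calls = 0
    received = 0
    while received < total:
        data = sock.recv(min(chunk, total - received))
        if not data:          # peer closed the connection early
            break
        received += len(data)
        calls += 1
    return calls

def count_recv_calls(payload_size, chunk):
    """Send `payload_size` bytes over a local socket pair and count the
    recv() calls needed to read them back with the given request size."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(b"x" * payload_size)
        return drain(receiver, payload_size, chunk)
    finally:
        sender.close()
        receiver.close()

one_byte_calls = count_recv_calls(1024, 1)
buffered_calls = count_recv_calls(1024, 4096)
print("recv() calls with 1-byte reads:   ", one_byte_calls)
print("recv() calls with 4096-byte reads:", buffered_calls)
```

Every recv() here is a trip into the kernel, so a client that reads byte by byte pays per-syscall overhead for each byte of the reply, which is the kind of client-side CPU cost that truss or ktrace would make visible.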

Mike "Silby" Silbersack

From owner-freebsd-performance@FreeBSD.ORG Thu Apr 6 00:39:47 2006
From: Sven Petai
To: freebsd-performance@freebsd.org
Date: Thu, 6 Apr 2006 03:39:30 +0300
References: <20060405134919.T16926@odysseus.silby.com>
In-Reply-To: <20060405134919.T16926@odysseus.silby.com>
Message-Id: <200604060339.31800.hadara@bsd.ee>
Cc: Mike Silbersack
Subject: Re: mysql tests - one more thing to try

On Wednesday 05 April 2006 21:52, Mike Silbersack wrote:

> If you're willing to spend more time looking at this, I suggest that you
> run truss or ktrace on the super-smack processes. I did a small amount of
> mysql vs postgres vs firebird benchmarking two years ago for a class
> project, and noticed that mysql's results were showing the same phenomenon
> - our test program was using more cpu than mysqld.
> I ran truss on our
> test program and found that it was doing ONE BYTE READS from the socket,
> rather than something larger.
>
> I never had the time to see if the problem was fixed at a later time or
> not. You may wish to see if that same condition is still happening.

here are ktrace results for supersmack and mysqld from a dualcore opteron box
running select smack with 100 threads and 10000 queries
os: fbsd 6.1 prerelease

==== syscall stats for supersmack ====

number of syscalls captured: 42575687

individual syscalls counts:
read            32914539   77.31%
gettimeofday     3865508    9.08%
fcntl            3862708    9.07%
write            1930922    4.54%
break                325    0.00%
close                301    0.00%
setsockopt           202    0.00%
select               192    0.00%
semop                184    0.00%
connect              101    0.00%
wait4                101    0.00%
socket               101    0.00%
getpid               100    0.00%
fork                 100    0.00%
exit                  97    0.00%
shutdown              96    0.00%
mmap                  27    0.00%
access                15    0.00%
mprotect              12    0.00%
open                  12    0.00%
fstat                 10    0.00%
munmap                 8    0.00%
execve                 7    0.00%
ioctl                  4    0.00%
stat                   3    0.00%
sigprocmask            2    0.00%
issetugid              2    0.00%
lseek                  1    0.00%
sysarch                1    0.00%
__sysctl               1    0.00%
semget                 1    0.00%
readlink               1    0.00%
sigaction              1    0.00%
pipe                   1    0.00%
__semctl               1    0.00%

request sizes for syscall read
size       count        %
---------------------------
4          15489977    47.06%
50          3982797    12.10%
1           3873493    11.77%
52          2654645     8.07%
60          1937753     5.89%
5           1933249     5.87%
8192        1931176     5.87%
53           790179     2.40%
51           274285     0.83%
49            23467     0.07%
48            12943     0.04%
1023           5301     0.02%
47             2543     0.01%
46             2497     0.01%
7               101     0.00%
24               99     0.00%
4096             29     0.00%
128               1     0.00%
6                 1     0.00%
17796             1     0.00%
30                1     0.00%
66                1     0.00%

request sizes for syscall write
size       count        %
---------------------------
65          1716567    88.90%
64           193387    10.02%
63            20584     1.07%
47              101     0.01%
5                94     0.00%
4                92     0.00%
26               92     0.00%
34                1     0.00%
35                1     0.00%
40                1     0.00%
49                1     0.00%
55                1     0.00%

==== syscall stats for mysqld ====

number of syscalls captured: 20743045

individual syscalls counts:
_umtx_op         7657825   36.92%
gettimeofday     4308736   20.77%
read             4306035   20.76%
write            1942882    9.37%
sigprocmask      1680724    8.10%
fcntl             838247    4.04%
thr_kill            2827    0.01%
sigwait             2403    0.01%
setitimer           1669    0.01%
select               698    0.00%
setsockopt           202    0.00%
access               101    0.00%
accept               101    0.00%
getsockname          101    0.00%
open                 101    0.00%
shutdown              96    0.00%
close                 96    0.00%
thr_new               92    0.00%
thr_exit              87    0.00%
break                 22    0.00%

request sizes for syscall write
size       count        %
---------------------------
304          794935    40.92%
303          720255    37.07%
302          274534    14.13%
301          111278     5.73%
300           23412     1.21%
299           13079     0.67%
297            2615     0.13%
298            2571     0.13%
11              101     0.01%
56              101     0.01%
63                1     0.00%

request sizes for syscall read
size       count        %
---------------------------
4           2362079    54.86%
61          1728044    40.13%
60           194721     4.52%
59            20993     0.49%
43              101     0.00%
1                96     0.00%
31                1     0.00%

From owner-freebsd-performance@FreeBSD.ORG Thu Apr 6 05:50:43 2006
Date: Thu, 6 Apr 2006 00:50:38 -0500 (CDT)
From: Mike Silbersack
To: Sven Petai
In-Reply-To: <200604060339.31800.hadara@bsd.ee>
Message-ID: <20060406004606.A25881@odysseus.silby.com>
References: <20060405134919.T16926@odysseus.silby.com> <200604060339.31800.hadara@bsd.ee>
Cc: freebsd-performance@freebsd.org
Subject: Re: mysql tests - one more thing to try
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Apr 2006 05:50:43 -0000 On Thu, 6 Apr 2006, Sven Petai wrote: > On Wednesday 05 April 2006 21:52, Mike Silbersack wrote: > >> If you're willing to spend more time looking at this, I suggest that you >> run truss or ktrace on the super-smack processes. I did a small amount of >> mysql vs postgres vs firebird benchmarking two years ago for a class >> project, and noticed that mysql's results were showing the same phenomena >> - our test program was using more cpu than mysqld. I run truss on our >> test program and found that it was doing ONE BYTE READS from the socket, >> rather than something larger. >> >> I never had the time to see if the problem was fixed at a later time or >> not. You may wish to see if that same condition is still happening. > > here are ktrace results for supersmack and mysqld from a dualcore opteron box > running select smack with 100 threads and 10000 queries > os: fbsd 6.1 prerelease > > ==== syscall stats for supersmack ==== > > request sizes for syscall read > size count % > --------------------------- > 4 15489977 47.06% > 50 3982797 12.10% > 1 3873493 11.77% > 52 2654645 8.07% > 60 1937753 5.89% > 5 1933249 5.87% > 8192 1931176 5.87% > 53 790179 2.40% > 51 274285 0.83% Thanks for running those tests, Sven. It looks like the problem still exists. :( I wish I had time to work on this... 
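
[Editorial sketch, not part of the original message.] The read-size table quoted above shows that 4-byte and 1-byte requests make up well over half of super-smack's read() calls. One plausible explanation, which nobody in the thread confirms, is that the client reads each 4-byte packet header (3-byte length plus 1-byte sequence number, as in the MySQL client/server protocol) with its own syscall and then reads the payload with another. The Python below is only an illustration under that assumption: the names frame, CountingSocket, read_header_then_body, read_buffered and compare are hypothetical, and this is not how libmysql or super-smack is actually written. It compares the number of recv() calls needed for the two strategies over a local socket pair.

```python
import socket
import struct


def frame(payload: bytes, seq: int = 0) -> bytes:
    """Prefix payload with a 4-byte header: 3-byte little-endian length + 1-byte sequence id."""
    return struct.pack("<I", len(payload))[:3] + bytes([seq & 0xFF]) + payload


class CountingSocket:
    """Wrap a socket and count recv() calls; each call stands for one read-style syscall."""

    def __init__(self, sock: socket.socket) -> None:
        self.sock = sock
        self.calls = 0

    def recv(self, nbytes: int) -> bytes:
        self.calls += 1
        return self.sock.recv(nbytes)


def read_exact(conn: CountingSocket, nbytes: int) -> bytes:
    """Read exactly nbytes, looping over short reads."""
    data = b""
    while len(data) < nbytes:
        chunk = conn.recv(nbytes - len(data))
        if not chunk:
            raise EOFError("peer closed the connection")
        data += chunk
    return data


def read_header_then_body(conn: CountingSocket, count: int) -> list:
    """One small read for the header, then one read for the payload, per packet."""
    packets = []
    for _ in range(count):
        header = read_exact(conn, 4)
        length = int.from_bytes(header[:3], "little")
        packets.append(read_exact(conn, length))
    return packets


def read_buffered(conn: CountingSocket, count: int, bufsize: int = 8192) -> list:
    """Read large chunks into a user-space buffer and parse packets out of it."""
    packets = []
    pending = b""
    while len(packets) < count:
        if len(pending) >= 4:
            length = int.from_bytes(pending[:3], "little")
            if len(pending) >= 4 + length:
                packets.append(pending[4:4 + length])
                pending = pending[4 + length:]
                continue
        chunk = conn.recv(bufsize)
        if not chunk:
            raise EOFError("peer closed the connection")
        pending += chunk
    return packets


def compare(count: int = 100, payload_size: int = 50) -> dict:
    """Return the number of recv() calls each strategy needs for the same byte stream."""
    payloads = [bytes([i % 256]) * payload_size for i in range(count)]
    wire = b"".join(frame(p, i) for i, p in enumerate(payloads))
    results = {}
    for name, reader in (("header_then_body", read_header_then_body),
                         ("buffered", read_buffered)):
        tx, rx = socket.socketpair()
        try:
            tx.sendall(wire)  # small enough to fit in the socket buffer
            conn = CountingSocket(rx)
            assert reader(conn, count) == payloads
            results[name] = conn.calls
        finally:
            tx.close()
            rx.close()
    return results


if __name__ == "__main__":
    print(compare())
```

The header-then-body reader needs at least two recv() calls per packet, while the buffered reader usually needs far fewer because one large read can return many packets at once. In a real request/response benchmark the client waits for each reply, so buffering would mostly save the separate header read rather than collapse everything into a single call.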
Mike "Silby" Silbersack From owner-freebsd-performance@FreeBSD.ORG Thu Apr 6 07:01:34 2006 Return-Path: X-Original-To: freebsd-performance@freebsd.org Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E3EBC16A422 for ; Thu, 6 Apr 2006 07:01:34 +0000 (UTC) (envelope-from joseph.koshy@gmail.com) Received: from xproxy.gmail.com (xproxy.gmail.com [66.249.82.198]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5FAD143D48 for ; Thu, 6 Apr 2006 07:01:34 +0000 (GMT) (envelope-from joseph.koshy@gmail.com) Received: by xproxy.gmail.com with SMTP id s9so60404wxc for ; Thu, 06 Apr 2006 00:01:33 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=oWxpUxVDur/s5Lx75rix8qGYs2WWdhyckxs50tcgNc6tB3jOqyB19uKeXxCWRNigq0ipfx1VycYdNBgCosVVbeOUREy8TPeV3swWhLVw1QpMdWavu9mrEkD4kpmcDdFmcAiGBjhYjY3lveJJhqcgG4CjHl+LYVjMHhz+cRwZxZg= Received: by 10.70.76.18 with SMTP id y18mr786688wxa; Thu, 06 Apr 2006 00:01:33 -0700 (PDT) Received: by 10.70.117.3 with HTTP; Thu, 6 Apr 2006 00:01:33 -0700 (PDT) Message-ID: <84dead720604060001k38cef1f3p7fbb13e3e6c3c662@mail.gmail.com> Date: Thu, 6 Apr 2006 12:31:33 +0530 From: "Joseph Koshy" To: "Sven Petai" In-Reply-To: <200604060339.31800.hadara@bsd.ee> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Content-Disposition: inline References: <20060405134919.T16926@odysseus.silby.com> <200604060339.31800.hadara@bsd.ee> Cc: freebsd-performance@freebsd.org, Ganbold , Mike Silbersack Subject: Re: mysql tests - one more thing to try X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Thu, 06 Apr 2006 07:01:35 -0000

> here are ktrace results for supersmack and mysqld from a dualcore opteron box
> running select smack with 100 threads and 10000 queries
> os: fbsd 6.1 prerelease
>
> ==== syscall stats for supersmack ====
> number of syscalls captured: 42575687
> individual syscalls counts:
> read            32914539   77.31%
> gettimeofday     3865508    9.08%
> fcntl            3862708    9.07%
> write            1930922    4.54%
> break                325    0.00%
> close                301    0.00%
> setsockopt           202    0.00%
> select               192    0.00%
> semop                184    0.00%
> connect              101    0.00%
> wait4                101    0.00%
> socket               101    0.00%
> getpid               100    0.00%
> fork                 100    0.00%
> exit                  97    0.00%
> shutdown              96    0.00%
> mmap                  27    0.00%
> access                15    0.00%
> mprotect              12    0.00%
> open                  12    0.00%
> fstat                 10    0.00%
> munmap                 8    0.00%
> execve                 7    0.00%
> ioctl                  4    0.00%
> stat                   3    0.00%
> sigprocmask            2    0.00%
> issetugid              2    0.00%
> lseek                  1    0.00%
> sysarch                1    0.00%
> __sysctl               1    0.00%
> semget                 1    0.00%
> readlink               1    0.00%
> sigaction              1    0.00%
> pipe                   1    0.00%
> __semctl               1    0.00%
>
> request sizes for syscall read
> size       count         %
> ---------------------------
> 4       15489977    47.06%
> 50       3982797    12.10%
> 1        3873493    11.77%
> 52       2654645     8.07%
> 60       1937753     5.89%
> 5        1933249     5.87%
> 8192     1931176     5.87%
> 53        790179     2.40%
> 51        274285     0.83%
> 49         23467     0.07%
> 48         12943     0.04%
> 1023        5301     0.02%
> 47          2543     0.01%
> 46          2497     0.01%
> 7            101     0.00%
> 24            99     0.00%
> 4096          29     0.00%
> 128            1     0.00%
> 6              1     0.00%
> 17796          1     0.00%
> 30             1     0.00%
> 66             1     0.00%
>
> request sizes for syscall write
> size       count         %
> ---------------------------
> 65       1716567    88.90%
> 64        193387    10.02%
> 63         20584     1.07%
> 47           101     0.01%
> 5             94     0.00%
> 4             92     0.00%
> 26            92     0.00%
> 34             1     0.00%
> 35             1     0.00%
> 40             1     0.00%
> 49             1     0.00%
> 55             1     0.00%
>
> ==== syscall stats for mysqld ====
>
> number of syscalls captured: 20743045
> individual syscalls counts:
> _umtx_op         7657825   36.92%
> gettimeofday     4308736   20.77%
> read             4306035   20.76%
> write            1942882    9.37%
> sigprocmask      1680724    8.10%
> fcntl             838247    4.04%
> thr_kill            2827    0.01%
> sigwait             2403    0.01%
> setitimer           1669    0.01%
> select               698    0.00%
> setsockopt           202    0.00%
> access               101    0.00%
> accept               101    0.00%
> getsockname          101    0.00%
> open                 101    0.00%
> shutdown              96    0.00%
> close                 96    0.00%
> thr_new               92    0.00%
> thr_exit              87    0.00%
> break                 22    0.00%
>
> request sizes for syscall write
> size       count         %
> ---------------------------
> 304       794935    40.92%
> 303       720255    37.07%
> 302       274534    14.13%
> 301       111278     5.73%
> 300        23412     1.21%
> 299        13079     0.67%
> 297         2615     0.13%
> 298         2571     0.13%
> 11           101     0.01%
> 56           101     0.01%
> 63             1     0.00%
> request sizes for syscall read
> size       count         %
> ---------------------------
> 4        2362079    54.86%
> 61       1728044    40.13%
> 60        194721     4.52%
> 59         20993     0.49%
> 43           101     0.00%
> 1             96     0.00%
> 31             1     0.00%

Ganbold was kind enough to run a hwpmc profiling session while
running super-smack.

The kernel profile can be seen here:
http://www.mnbsd.org/ftp/kernel_gprof.txt

Excerpt:

granularity: each sample hit covers 4 byte(s) for 0.00% of 111655.00 seconds
time is in ticks, not seconds

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  6.0    6664.00  6664.00        0  100.00%           syscall [1]
  4.1   11215.00  4551.00        0  100.00%           soreceive [2]
  4.1   15743.00  4528.00        0  100.00%           binuptime [3]
  3.1   19217.00  3474.00        0  100.00%           generic_copyout [4]
  2.7   22230.00  3013.00        0  100.00%           generic_copyin [5]

The profile for libmysql
(http://www.mnbsd.org/ftp/libmysql_gprof.txt) looked like:

time is in ticks, not seconds
granularity: each sample hit covers 4 byte(s) for 0.00% of 23377.00 seconds

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 12.2    2850.00  2850.00        0  100.00%           MD5Transform [1]
 11.3    5503.00  2653.00        0  100.00%           sha1_result [2]
  9.6    7752.00  2249.00        0  100.00%           my_once_alloc [3]
  7.7    9561.00  1809.00        0  100.00%           my_compress [4]
  6.8   11150.00  1589.00        0  100.00%           load_defaults [5]
  3.2   11904.00   754.00        0  100.00%           my_once_strdup [6]

These profiles are consistent with there being lots of short-lived system calls.
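
[Editorial sketch, not part of the original message.] To put a rough number on "lots of short-lived system calls", the figures from the quoted ktrace summaries can be turned into per-request ratios. The Python below uses only counts that appear in the thread. It assumes, as a simplification that nobody in the thread states, that each write() corresponds to one query sent by super-smack or one reply sent by mysqld. The names per_write and small_read_share are hypothetical helpers.

```python
# Counts copied from the ktrace summaries quoted in this thread.
SUPERSMACK = {"total": 42575687, "read": 32914539, "write": 1930922}
MYSQLD = {"total": 20743045, "read": 4306035, "write": 1942882}

# super-smack read() requests of 5 bytes or fewer (request size -> count).
SUPERSMACK_SMALL_READS = {4: 15489977, 1: 3873493, 5: 1933249}


def per_write(stats: dict) -> dict:
    """Ratios per write(), using write() as a stand-in for one query or reply (assumption)."""
    writes = stats["write"]
    return {
        "reads_per_write": stats["read"] / writes,
        "syscalls_per_write": stats["total"] / writes,
    }


def small_read_share(small_reads: dict, total_reads: int) -> float:
    """Fraction of all read() calls that requested only a few bytes."""
    return sum(small_reads.values()) / total_reads


if __name__ == "__main__":
    print("super-smack:", per_write(SUPERSMACK))
    print("mysqld:     ", per_write(MYSQLD))
    print("small-read share:",
          small_read_share(SUPERSMACK_SMALL_READS, SUPERSMACK["read"]))
```

With these inputs the client comes out at roughly 17 reads and 22 syscalls per write, mysqld at roughly 2.2 reads and 10.7 syscalls per write, and about 65% of the client's reads request 5 bytes or fewer. That matches the observation earlier in the thread that the benchmark client uses more CPU than the server.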
Due to a shortcoming in the profiling toolset, Ganbold's run could not capture what mysqld was doing. -- FreeBSD Volunteer, http://people.freebsd.org/~jkoshy From owner-freebsd-performance@FreeBSD.ORG Thu Apr 6 09:12:43 2006 Return-Path: X-Original-To: freebsd-performance@freebsd.org Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DC72A16A400 for ; Thu, 6 Apr 2006 09:12:43 +0000 (UTC) (envelope-from mv@roq.com) Received: from p4.roq.com (ns1.ecoms.com [207.44.130.137]) by mx1.FreeBSD.org (Postfix) with ESMTP id 70E2F43D45 for ; Thu, 6 Apr 2006 09:12:43 +0000 (GMT) (envelope-from mv@roq.com) Received: from p4.roq.com (localhost.roq.com [127.0.0.1]) by p4.roq.com (Postfix) with ESMTP id 27B7F4CD6C; Thu, 6 Apr 2006 09:13:20 +0000 (GMT) Received: from [192.168.46.101] (ppp166-27.static.internode.on.net [150.101.166.27]) by p4.roq.com (Postfix) with ESMTP id 1473E4CD63; Thu, 6 Apr 2006 09:13:17 +0000 (GMT) Message-ID: <4434DB85.10104@roq.com> Date: Thu, 06 Apr 2006 19:12:37 +1000 From: Michael Vince User-Agent: Thunderbird 1.5 (X11/20060216) MIME-Version: 1.0 To: Steven Hartland References: <200604041942.18767.hadara@bsd.ee> <021b01c658d2$de254a00$b3db87d4@multiplay.co.uk> In-Reply-To: <021b01c658d2$de254a00$b3db87d4@multiplay.co.uk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV using ClamSMTP Cc: Sven Petai , freebsd-performance@freebsd.org Subject: Re: mysql performance on 4 * dualcore opteron X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Apr 2006 09:12:43 -0000 I just ran a test on 6_stable (April 5th) on a Dell 2850 dual CPU (single core 3.60GHz) using the AMD64 build of FreeBSD and got similar speeds as you. 
It's interesting that Sven's 8 cores appear to deliver less MySQL speed than just having a few cores.
After enabling libthr it does jump by about 3,500 q/s (22061.88 to 25601.49) on a generic SMP kernel; I didn't try any more serious tweaks.

For those who are interested in exactly how I tested, here's what I did.

portupgrade -RN -m 'BUILD_OPTIMIZED=yes WITH_PROC_SCOPE_PTH=yes' /usr/ports/databases/mysql41-server
portupgrade -RN /usr/ports/benchmarks/super-smack

super-smack -d mysql /usr/local/share/super-smack/select-key.smack 10 10000
Query Barrel Report for client smacker1
connect: max=4ms min=1ms avg= 2ms from 10 clients
Query_type      num_queries  max_time  min_time  q_per_s
select_index    200000       0         0         22061.88

With the following in my /etc/libmap.conf to select libthr, and a MySQL restart
(/usr/local/etc/rc.d/mysql-server restart), the numbers do jump.

[/usr/local/libexec/mysqld]
libpthread.so.2 libthr.so.2
libpthread.so libthr.so

super-smack -d mysql /usr/local/share/super-smack/select-key.smack 10 10000
Query Barrel Report for client smacker1
connect: max=238ms min=0ms avg= 117ms from 10 clients
Query_type      num_queries  max_time  min_time  q_per_s
select_index    200000       0         0         25601.49

I have also benchmarked Apache with libthr using 'ab' and found it can deliver extra megabytes/sec of data (I think it was about an extra 2000 requests/sec), at the cost of giving the server, from what I remember, almost double the 'average load' according to 'top'.
So if your machine has nothing else to do but deliver data purely from Apache, then libthr is worthwhile for Apache as well.

Mike

Steven Hartland wrote:

> Looking at this on a dual box here ( waiting for the new MB for dual
> dual core )
> All the time is spent processing super-smack and only 25% on mysqld.
> Even dropping to 10 clients a large portion is take by the clients.
> That said there is a lot that can be gained by using the tweaks out there
> i.e.
ULE + libthr + TSC + context_time.patch + cpu_acct_1.patch + > cpu_acct_2.patch > Adding these jumps from a baseline: > select_index 2000000 8 0 18624.60 > to: > select_index 2000000 5 0 29942.10 > > The biggest increases coming from libthr ( thanks DavidXu ) and the ULE > scheduler. > > [log] > == 4BSD + libpthread + ACPI-Fast == > super-smack -d mysql select-key.smack 100 10000 > Query Barrel Report for client smacker1 > connect: max=46ms min=6ms avg= 25ms from 100 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 8 0 18624.60 > > super-smack -d mysql select-key.smack 10 100000 > Query Barrel Report for client smacker1 > connect: max=5ms min=0ms avg= 1ms from 10 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 0 0 23983.87 > > == 4BSD + libthr + ACPI-Fast == > super-smack -d mysql select-key.smack 100 10000 > Query Barrel Report for client smacker1 > connect: max=107ms min=2ms avg= 45ms from 100 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 13 0 22413.39 > > super-smack -d mysql select-key.smack 10 100000 > Query Barrel Report for client smacker1 > connect: max=2ms min=1ms avg= 1ms from 10 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 0 0 26841.07 > > == 4BSD + libthr + TSC == > super-smack -d mysql select-key.smack 100 10000 > Query Barrel Report for client smacker1 > connect: max=46ms min=1ms avg= 21ms from 100 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 11 0 23428.03 > > super-smack -d mysql select-key.smack 10 100000 > Query Barrel Report for client smacker1 > connect: max=2ms min=0ms avg= 1ms from 10 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 0 0 26403.95 > > == ULE + libthr + TSC == > super-smack -d mysql select-key.smack 100 10000 > Query Barrel Report for client smacker1 > connect: max=41ms min=0ms avg= 23ms from 100 clients Query_type > 
num_queries max_time min_time q_per_s > select_index 2000000 5 0 28581.18 > > super-smack -d mysql select-key.smack 10 100000 > Query Barrel Report for client smacker1 > connect: max=4ms min=0ms avg= 1ms from 10 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 0 0 30128.44 > > == ULE + libthr + TSC + context_time.patch + cpu_acct_1.patch + > cpu_acct_2.patch == > super-smack -d mysql select-key.smack 100 10000 > Query Barrel Report for client smacker1 > connect: max=27ms min=0ms avg= 14ms from 100 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 5 0 29942.10 > > super-smack -d mysql select-key.smack 10 100000 > Query Barrel Report for client smacker1 > connect: max=12ms min=0ms avg= 4ms from 10 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 0 0 31057.52 > > == 4BSD + libthr + TSC + context_time.patch + cpu_acct_1.patch + > cpu_acct_2.patch == > super-smack -d mysql select-key.smack 100 10000 > Query Barrel Report for client smacker1 > connect: max=54ms min=20ms avg= 38ms from 100 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 9 0 24144.22 > > super-smack -d mysql select-key.smack 10 100000 > Query Barrel Report for client smacker1 > connect: max=2ms min=0ms avg= 1ms from 10 clients Query_type > num_queries max_time min_time q_per_s > select_index 2000000 0 0 27073.46 > > ** update test ** > super-smack -d mysql update-select.smack 10 100000 > Query Barrel Report for client smacker > connect: max=3ms min=0ms avg= 0ms from 10 clients Query_type > num_queries max_time min_time q_per_s > select_index 1000000 1 0 6468.70 > update_index 1000000 0 0 6468.70 > [/log] > > Machine: > Dual 244, 2Gb running FreeBSD 6.1-PRERELEASE (i386) > Package install of mysql 4.0 > Port install of super-smack > > Notes: > No detectable disk activity thoughout the tests > ULE scheduler breaks the output from top with everything showing as > WCPU 0% in the 
100 concurrency test and the numbers not adding up > at all in 10 concurrency test or showing 0%. > To get context_time.patch to work I needed the attached patch which > is basically two failed chunks of: kern/kern_exit.c moved to > kern/kern_thread.c > > Steve > ----- Original Message ----- From: "Sven Petai" > To: > Sent: Tuesday, April 04, 2006 5:42 PM > Subject: mysql performance on 4 * dualcore opteron > > >> hi >> >> Before I begin, let me just say that I'm probably aware most of the >> threads about mysql performance in various fbsd lists over last >> couple of years, so please let's not consentrate on the usual points >> made over and over again like how filesystems are mounted under >> linux, how fast time() is or how various combinations of >> scheduler/threding library/compiler flags give you ~5-10% better >> performance. It's very unlikely that any of these reasons, or even >> all of them together can explain performance differences of 2-3 * >> so now a little bit of the backround... >> I usually use MySQL benchmark called super-smack as one of the >> benchmarks on all the new machines to get a general feeling of the >> servers performance. >> I certainly agree that the default smack workloads are far too simple >> to say much about actual production performance, but still... better >> than nothing... >> >> In general 2.4Ghz amd64 UP box (6.1 betaX) can do about >> 17400 q/s with select-smack+4bsd+thr combination and >> 4300 q/s with update-smack+4bsd+thr >> >> on dualcore 2Ghz opteron (6.1 prerelease) the results are: >> 20000 q/s with select-smack+4bsd+thr and >> 4500 q/s with update-smack+4bsd+thr >> >> performance for update-smack seems to be always 4XXX q/s, no matter >> how many CPUs the box has or what kind or raid controller/disks are >> used (i have tested on about 8 rather different machines). I have no >> idea if IO on all the servers I have tried really maxes out at this >> point or is there some bottleneck in UFS. 
>> select-smack performance gains on dualcore are not quite as good as >> one might expect, but then again that dualcore box uses ECC memory >> which is probably somewhat slower because of the checksum >> calculations, and synchronisation has some overhead too... Anyway all >> in all I'm more or less happy with these results, even though linux >> will do about twise as much selects on the same hardware. >> >> Today I had a chance to test 4 * 2Ghz dualcore opteron machine, so >> this machine has 8 cores in total and 8G of RAM. >> >> Now, on that server I get: >> 11000 q/s for select-smack+4bsd+thr combination (with KSE it's around >> 6000 q/s, ule+thr gives somewhere around 12000 q/s) >> 4100 q/s for update-smack+4bsd+thr >> >> So the 8 core machine got almost 2* worse result for select than UP >> server. >> >> After some tinkering I found out that renicing mysqld to -5 will make >> it push out 21000 q/s (4bsd, thr), so I suspect part of the problem >> is in the scheduling - probably super-smack with it's 100 processes >> gets just a lot more CPU time otherwise than mysql with it's 100 >> threads servicing them. But anyway even this result is still only >> about equal in performance to what I get from dualcore machine. >> >> As I ran out of good (macro)tuning ideas at this point, and wanted to >> make sure higher scores are indeed achievable, I tried Linux on the >> same hardware. >> Here are the results for same tests on Suse enterprise linux 9 >> (2.6.5-7.97-smp): >> 76857 q/s for select-smack >> 10050 q/s for update-smack >> >> the mysql configuration was identical to the one I used under freebsd >> (my-huge). This Suse uses ReiserFS, but I have no idea about what >> kind of FS guarantees it provides, didn't see any sync/async stuff in >> the mount output. >> I also repeated the tests on identical box that had Fedora installed >> (2.6.9-22-ELsmp) and used ext3'fs. 
>> select-smack results were obviously almost the same as it doesn't >> touch the FS, update was about 8000 q/s. >> >> I'm relativelly sure that this kind of huge performance differences >> can't be explained by mere speed difference of time(), I haven't yet >> tested phk'd and roberts timer hacks, but at some point in time I >> rewrote mysql's timing code to completelly avoid any calls to time() >> by keeping internal timestamp that was updated from TSC reg. value. >> It was certainly very ugly and imprecise, but worked well enough >> since mysql uses these code paths mainly for statistics and for >> setting various safeguard timeouts. Even with ~90% time() calls >> removed the performance still didn't get measurably better. >> Of course it's possible that I fucked up somehow, so if someone has >> tested roberts and phk's changes then it would be certainly nice to >> hear about your results. >> >> To make the long story short - does anyone have any good ideas about >> where might the bottleneck and how to debug it ? >> >> PS >> Here's some system/test information: >> super-smack was used with concurrency of 100 and reqs. set to 10000 >> it was running on the same machine as the mysqld and connections were >> done over local socket. 
>> >> timer: acpi-fast in all the cases >> mysql: 4.1.18_2 from ports, table type is myisam >> mysql configuration file: >> http://bsd.ee/~hadara/debug/mysql3/2way/my.cnf >> in general it's just my-huge.cnf from mysql distribution, with >> increased max_connections >> >> kernel config is GENERIC-SMP (no it doesn't have WITNESS enabled) >> == 4 * dualcore opteron ==: >> vmstat 1, during select-smack test: >> http://bsd.ee/~hadara/debug/mysql3/8way/vmstat.txt >> dmesg: >> http://bsd.ee/~hadara/debug/mysql3/8way/dmesg.boot >> sysctl -a: >> http://bsd.ee/~hadara/debug/mysql3/8way/sysctl.txt >> >> == 1 * dualcore opteron ==: >> vmstat 1, during select-smack test: >> http://bsd.ee/~hadara/debug/mysql3/2way/vmstat.txt >> dmesg: >> http://bsd.ee/~hadara/debug/mysql3/2way/dmesg.boot >> sysctl -a: >> http://bsd.ee/~hadara/debug/mysql3/2way/sysctl.txt >> _______________________________________________ >> freebsd-performance@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-performance >> To unsubscribe, send any mail to >> "freebsd-performance-unsubscribe@freebsd.org"
From owner-freebsd-performance@FreeBSD.ORG Sat Apr 8 09:44:41 2006 Return-Path: X-Original-To: freebsd-performance@freebsd.org Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 833F316A400 for ; Sat, 8 Apr 2006 09:44:41 +0000 (UTC) (envelope-from mv@roq.com) Received: from p4.roq.com (ns1.ecoms.com [207.44.130.137]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2BA1D43D4C for ; Sat, 8 Apr 2006 09:44:40 +0000 (GMT) (envelope-from mv@roq.com) Received: from p4.roq.com (localhost.roq.com [127.0.0.1]) by p4.roq.com (Postfix) with ESMTP id ACDA14CD18; Sat, 8 Apr 2006 09:45:22 +0000 (GMT) Received: from [192.168.0.6] (ppp157-158.static.internode.on.net [150.101.157.158]) by p4.roq.com (Postfix) with ESMTP id 4ADC44CCCC; Sat, 8 Apr 2006 09:45:21 +0000 (GMT) Message-ID: <44378600.7010004@roq.com> Date: Sat, 08 Apr 2006 19:44:32 +1000 From: Michael Vince User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.12) Gecko/20060213 X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Xu References: <200604041942.18767.hadara@bsd.ee> <021b01c658d2$de254a00$b3db87d4@multiplay.co.uk> <4434DB85.10104@roq.com> <200604081315.29842.yfxu@corp.netease.com> In-Reply-To: <200604081315.29842.yfxu@corp.netease.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV using ClamSMTP Cc: Sven Petai , freebsd-performance@freebsd.org, Steven Hartland Subject: Re: mysql performance on 4 * dualcore opteron X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id:
Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Apr 2006 09:44:41 -0000

David Xu wrote:

>On Thursday 06 April 2006 17:12, Michael Vince wrote:
>
>>I have also done benchmarking with libthr against Apache using 'ab' and
>>found it can deliver an extra amount of megabytes/sec of data (I think
>>it was about an extra 2000/requests sec) at the cost of giving the
>>server from what I remember almost double the 'average load' according
>>to 'top'
>>Given that if your machine has nothing else to do but deliver data
>>purely from Apache then even libthr is more worth while for Apache as well.
>>
>>Mike
>
>libpthread default uses M:N threads which means a thread may be on
>userland scheduler's run queue, and FreeBSD kernel does not know,
>so it will be not shown on average load, default system tools are not
>very useful here.
>
>David Xu

Yeah, I figured that, which isn't fair because it makes libpthread look better than libthr when it isn't.
I did notice during the tests that when Apache under libthr was delivering the extra megabytes/sec, the server felt equally responsive even though 'top' was reporting a higher load.

I have also tried running Perl under libthr for a single-threaded log analyzer and, to my surprise, it could even process logs faster.

libthr is also really useful if you pay attention to top's 'thr' column, since it shows the true number of threads; under libpthread it shows a couple, and under libc_r I could have 1000 threads going but top just shows 1.
Mike From owner-freebsd-performance@FreeBSD.ORG Sat Apr 8 11:02:15 2006 Return-Path: X-Original-To: freebsd-performance@freebsd.org Delivered-To: freebsd-performance@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 821F816A401 for ; Sat, 8 Apr 2006 11:02:15 +0000 (UTC) (envelope-from hadara@bsd.ee) Received: from mx1.starman.ee (smtp-out2.starman.ee [85.253.0.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1F1F943D46 for ; Sat, 8 Apr 2006 11:02:14 +0000 (GMT) (envelope-from hadara@bsd.ee) Received: from depression.softematic.com (depression.softematic.com [62.65.205.81]) by mx1.starman.ee (Postfix) with ESMTP id 4301F23C1AA; Sat, 8 Apr 2006 14:02:10 +0300 (EEST) From: Sven Petai To: David Xu Date: Sat, 8 Apr 2006 14:02:00 +0300 User-Agent: KMail/1.9.1 References: <200604041942.18767.hadara@bsd.ee> <44378600.7010004@roq.com> <200604081832.46971.yfxu@corp.netease.com> In-Reply-To: <200604081832.46971.yfxu@corp.netease.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200604081402.00982.hadara@bsd.ee> X-Virus-Scanned: by Amavisd-New at mx1.starman.ee Cc: Michael Vince , freebsd-performance@freebsd.org, Steven Hartland Subject: Re: mysql performance on 4 * dualcore opteron X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Apr 2006 11:02:15 -0000 On Saturday 08 April 2006 13:32, David Xu wrote: > On Saturday 08 April 2006 17:44, Michael Vince wrote: > > I have also tried putting my Perl under libthr for a single thread log > > analyzer and to my surprise it even could process logs faster. 
> > I don't know why, but I only know I did some micro optimizations in libthr, > and the library is small and may be fully cached in L1 cache on athlon > xp/64 CPU, don't take it seriously. ;-) > > > libthr is also really useful for actually paying attention to tops 'thr' > > column since it does show actual true thread number activity, under > > pthread it shows a couple and under libc_r I could have 1000 threads > > going but top just shows 1. Which makes me wonder if anyone has seen any realistic workload type under which kse library would outperform libthr ? My experiences have always been pretty much the same as yours - everything seems to be faster with libthr and in the case of mysql, even more stable. So shouldn't libthr be made the default one instead of libkse ? From owner-freebsd-performance@FreeBSD.ORG Sat Apr 8 05:15:46 2006 Return-Path: X-Original-To: freebsd-performance@freebsd.org Delivered-To: freebsd-performance@freebsd.org Received: from localhost.my.domain (localhost [127.0.0.1]) by hub.freebsd.org (Postfix) with ESMTP id 4E0A216A405; Sat, 8 Apr 2006 05:15:46 +0000 (UTC) (envelope-from yfxu@corp.netease.com) From: David Xu Organization: netease.com To: freebsd-performance@freebsd.org Date: Sat, 8 Apr 2006 13:15:29 +0800 User-Agent: KMail/1.8.2 References: <200604041942.18767.hadara@bsd.ee> <021b01c658d2$de254a00$b3db87d4@multiplay.co.uk> <4434DB85.10104@roq.com> In-Reply-To: <4434DB85.10104@roq.com> MIME-Version: 1.0 Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Message-Id: <200604081315.29842.yfxu@corp.netease.com> X-Mailman-Approved-At: Sat, 08 Apr 2006 11:27:36 +0000 Cc: Michael Vince , Sven Petai , Steven Hartland Subject: Re: mysql performance on 4 * dualcore opteron X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Sat, 08 Apr 2006 05:15:46 -0000

On Thursday 06 April 2006 17:12, Michael Vince wrote:
> I have also done benchmarking with libthr against Apache using 'ab' and
> found it can deliver an extra amount of megabytes/sec of data (I think
> it was about an extra 2000/requests sec) at the cost of giving the
> server from what I remember almost double the 'average load' according
> to 'top'
> Given that if your machine has nothing else to do but deliver data
> purely from Apache then even libthr is more worth while for Apache as well.
>
> Mike

libpthread uses M:N threads by default, which means a thread may be on the
userland scheduler's run queue without the FreeBSD kernel knowing about it,
so it will not be shown in the load average; the default system tools are not
very useful here.

David Xu

From owner-freebsd-performance@FreeBSD.ORG Sat Apr 8 10:33:03 2006 Return-Path: X-Original-To: freebsd-performance@freebsd.org Delivered-To: freebsd-performance@freebsd.org Received: from localhost.my.domain (localhost [127.0.0.1]) by hub.freebsd.org (Postfix) with ESMTP id 9CA9E16A400; Sat, 8 Apr 2006 10:33:03 +0000 (UTC) (envelope-from yfxu@corp.netease.com) From: David Xu Organization: netease.com To: Michael Vince Date: Sat, 8 Apr 2006 18:32:46 +0800 User-Agent: KMail/1.8.2 References: <200604041942.18767.hadara@bsd.ee> <200604081315.29842.yfxu@corp.netease.com> <44378600.7010004@roq.com> In-Reply-To: <44378600.7010004@roq.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200604081832.46971.yfxu@corp.netease.com> X-Mailman-Approved-At: Sat, 08 Apr 2006 11:34:47 +0000 Cc: Sven Petai , freebsd-performance@freebsd.org, Steven Hartland Subject: Re: mysql performance on 4 * dualcore opteron X-BeenThere: freebsd-performance@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Performance/tuning List-Unsubscribe: , List-Archive: List-Post:
List-Help: List-Subscribe: , X-List-Received-Date: Sat, 08 Apr 2006 10:33:04 -0000

On Saturday 08 April 2006 17:44, Michael Vince wrote:
> I have also tried putting my Perl under libthr for a single thread log
> analyzer and to my surprise it even could process logs faster.
>

I don't know why; I only know that I did some micro-optimizations in libthr,
and the library is small and may be fully cached in the L1 cache on an
Athlon XP/64 CPU. Don't take it seriously. ;-)

> libthr is also really useful for actually paying attention to tops 'thr'
> column since it does show actual true thread number activity, under
> pthread it shows a couple and under libc_r I could have 1000 threads
> going but top just shows 1.
>
> Mike