From owner-freebsd-threads@FreeBSD.ORG Sun May 23 22:08:36 2004
Date: Mon, 24 May 2004 01:07:48 -0400 (EDT)
From: Robert Watson <robert@fledge.watson.org>
To: Petri Helenius
cc: Julian Elischer
cc: freebsd-threads@freebsd.org
Subject: Re: Why is MySQL nearly twice as fast on Linux?
List-Id: Threading on FreeBSD

On Sun, 23 May 2004, Petri Helenius wrote:

> > There is obviously a bottleneck, but it's very hard to tell what it
> > is. My guess is that the scheduler(s) are not doing a very good job,
> > and the fact that GIANT is not removed from the kernel yet says that
> > generally syscalls will be a bottleneck.
>
> While watching the top output, I saw a "logjam" appear from time to
> time where all processes/threads were waiting for Giant.
> However, I don't feel that causes the large impact; it might
> contribute 10-20%, but it does not feel frequent enough to cause a 50%
> difference.

top is a little misleading because it has to acquire Giant in order to
check the status of the other processes, which increases the chance of
Giant contention.

There are at least a few things going on here. Among various results, I
saw that switching to a UP kernel improved performance, but not nearly
enough, which suggests that lock contention alone does not explain the
gap. If you want to investigate lock contention, there are a couple of
things you might try:

(1) Compile the kernel with MUTEX_PROFILING -- it has two contention
    measurement fields that can help track contention. Note that
    running with mutex profiling will dramatically hurt performance,
    but it might still be quite informative.

(2) It might be interesting to run with the netperf patches, as they
    should greatly reduce contention for local UNIX domain socket I/O.
    I haven't tried any benchmarking with MySQL, but it might be worth
    a try. You can find information on the ongoing work at:

	http://www.watson.org/~robert/freebsd/netperf/

    The work is moving fairly fast, as I'm working on tracking down
    additional socket nits, but it could help.

> > ULE should be able to do a better job at scheduling with multiple
> > CPUs, but it is a work in progress. If threads all hit a
> > Giant-based logjam, there is not a lot the scheduler can do about
> > it.
>
> I find it hard to believe that the threading stuff would be seriously
> broken, since we do large processing with libkse and don't have issues
> with the performance. However, I'm observing about 50000 context
> switches but only 5000 syscalls a second. (I know it's a different
> application, but even for 1500 queries a second, 70000 syscalls sounds
> excessive.)
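For what it's worth, a sketch of how the MUTEX_PROFILING data might be
gathered. The sysctl node names below are from memory for 5.x-era
kernels and may differ on your version -- treat them as assumptions:

```
# Kernel configuration fragment -- add to your config, rebuild, reboot:
options         MUTEX_PROFILING

# At runtime: enable profiling, run the workload, then dump the
# per-acquisition-point statistics (node names assumed):
sysctl debug.mutex.prof.enable=1
    <run the MySQL benchmark here>
sysctl debug.mutex.prof.stats
```

Sorting that output by the contention columns should point at the
hottest locks.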
ULE has some sort of known load balancing problem between multiple CPUs
-- I observed it in some local benchmarking with ubench, at least a
month or so ago. It seemed to prevent highly busy processes derived
from the same process tree from migrating properly. SCHED_4BSD did not
have this problem. But since we've seen results suggesting that
changing to SCHED_4BSD didn't help all that much either, it's still
likely not to be the cause.

A few months ago I did some work to optimize system call cost a bit --
we had some extra mutex operations. It might be interesting to use
ktrace or truss to generate a profile of the system call mix in use;
perhaps that would give some informative results about things to look
at.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Senior Research Scientist, McAfee Research
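To make the ktrace/truss idea above concrete, here is a rough sketch of
tallying the system call mix from kdump output. The sample trace is
fabricated and the kdump field layout is assumed from memory; on a real
system you would run something like `ktrace -p <pid>` followed by
`kdump`, instead of the here-document:

```shell
# Fabricated sample of kdump-style output (pid, command, record type,
# syscall with arguments); real kdump output will differ in detail.
cat <<'EOF' > /tmp/kdump.sample
 12345 mysqld   CALL  read(0x5,0x809000,0x1000)
 12345 mysqld   RET   read 128/0x80
 12345 mysqld   CALL  write(0x6,0x809000,0x80)
 12345 mysqld   CALL  read(0x5,0x809000,0x1000)
EOF

# Count only CALL records, strip the argument list from the syscall
# name, and print the mix sorted by frequency.
awk '$3 == "CALL" { sub(/\(.*/, "", $4); n[$4]++ }
     END { for (s in n) print n[s], s }' /tmp/kdump.sample | sort -rn
```

On the sample above this prints `2 read` and `1 write`; run against a
real trace, it shows which syscalls dominate the 70000/second figure.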