From owner-freebsd-stable@FreeBSD.ORG Wed Jul 7 12:17:31 2004 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 098B616A4CE for ; Wed, 7 Jul 2004 12:17:31 +0000 (GMT) Received: from seadragon.mweb.co.za (seadragon.mweb.co.za [196.2.45.68]) by mx1.FreeBSD.org (Postfix) with ESMTP id A5B8843D46 for ; Wed, 7 Jul 2004 12:17:29 +0000 (GMT) (envelope-from jaco@coocoo.za.net) Received: by seadragon.mweb.co.za (Postfix, from userid 501) id 68D3864135; Wed, 7 Jul 2004 14:17:27 +0200 (SAST) Received: from firewall.symphiano (rdg-dial-196-30-234-135.mweb.co.za [196.30.234.135]) by seadragon.mweb.co.za (Postfix) with SMTP id CCA3264112 for ; Wed, 7 Jul 2004 14:17:20 +0200 (SAST) Received: (qmail 27461 invoked from network); 7 Jul 2004 12:12:27 -0000 Received: from unknown (HELO ?192.168.53.102?) (192.168.53.102) by firewall.symphiano with SMTP; 7 Jul 2004 12:12:27 -0000 Message-ID: <40EBE9CF.50001@coocoo.za.net> Date: Wed, 07 Jul 2004 14:17:19 +0200 From: Jaco van Tonder Organization: Private User-Agent: Mozilla Thunderbird 0.6 (Windows/20040502) X-Accept-Language: en-us, en MIME-Version: 1.0 To: stable@freebsd.org References: <20040706193151.K6269@shell.inch.com> In-Reply-To: <20040706193151.K6269@shell.inch.com> X-Enigmail-Version: 0.84.0.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on seadragon.mweb.co.za X-Spam-Level: X-Spam-Status: No, hits=0.6 required=7.5 tests=TW_DP,TW_DV,TW_MV,TW_PD,TW_PG, TW_QL,TW_VP,TW_ZF X-Spam-Report: * 0.1 TW_MV BODY: Odd Letter Triples with MV * 0.1 TW_ZF BODY: Odd Letter Triples with ZF * 0.1 TW_PG BODY: Odd Letter Triples with PG * 0.1 TW_DV BODY: Odd Letter Triples with DV * 0.1 TW_DP BODY: Odd Letter Triples with DP * 0.1 TW_VP BODY: Odd Letter Triples with VP * 0.1 TW_QL BODY: Odd Letter Triples with QL * 0.1 TW_PD BODY: Odd Letter Triples with PD Subject: Re: mysql slowdown under load X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: jaco@coocoo.za.net List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 07 Jul 2004 12:17:31 -0000 Charles Sprickman typed on a keyboard not too far away, on 7/7/2004 1:56 AM: >Hi, > >I've been trying to track down this problem for some time now. Since I >have a temporary fix in place, I can breathe a bit and hope someone here >has some answers (questions are good too!). > >I have a dual Athlon 1600MP box running 4.8-p3. It's a mail server >running qmail with vpopmail on top of that, backed by mysql for storing >user info (user/pass/homedir, etc.). Webmail is currently served off of >the box as well (waiting for a vpopmail upgrade to peel this service off >the box). > >In the past week we've had complaints from users about having to re-enter >their POP passwords, some bounces of legit mail (chkuser patch on >qmail-smtpd gives a 550 if it can't do a lookup, blech), and password >failures on webmail. After some digging, I was able to figure out that >this all came down to mysql not returning an answer fast enough. > > All I can say is that I have the exact same problem with the exact same software. Also, this machine's load rarely goes above 2.00 (ie, the only time it does is when I build kernel or world). This machine also NEVER swaps. >The box generally runs a load of about .70, peaks at 3-4 during heavy spam >runs. Each mail delivery can theoretically generate about 6 sql queries >(don't ask - I wouldn't likely do vpopmail again if I had a choice). >There are about 3,000 active mail users on here. Even at high loads, the >box is not swapping at all, and interactive response such as webmail and >normal IMAP sessions is still excellent. But the higher the load goes, >the more problems mysql seems to have. The slow query log grows, even >simple db "pings" can take up to 45 seconds to complete. > >Mysql is built from ports, so far I've had these versions running: > >4.0.18 w/LinuxThreads >4.0.20 w/LinuxThreads >4.0.20 w/native threads > >There was no difference going from .18 to .20, but going from LinuxThreads >to system threads did bring a bit of relief. > >My vague hypothesis here (based on past work on larger scale qmail boxes) >is that at some point the kernel just gets a bit "lost" as to what >processes get prioritized; qmail with it's "hand the mail from process to >process" architecture causes lots of processes to be spawned then >destroyed. It seems to me that the system gets so busy with spawning all >of this that mysql languishes. > >Looking at iostat, vmstat and systat, it seems like the number of syscalls >is very high, as is the "context switches" number. However memory is fine >(no paging in/out to disk), cpu is fine, disk i/o is fairly low >(underlying storage is 4 disks on an adaptec raid controller, raid 1+0), >but mysql continues to "stall". Mysqladmin and some mrtg graphing shows >that I'm doing about 15 queries/second peak (which seems pretty low >compared to past work I've done with mysql). > >I saw a previous thread about this problem that never really came to any >good conclusions; the tester cron'd a script that just connected to mysql, >ran a simple query and recorded whether the connection failed or not and >how long it took. Even under very low load connects would timeout or take >a long time to complete. > >My band-aid is to replicate the db to another box and use that for >vpopmail for all non-update/insert queries. So far, so good, but my >curiousity has the better of me on this; it seems like I should be able to >run a local mysql server with such a low Q/S rate without issue. > > I too would like to know why this happens. >This could also be fixed in vpopmail; the design there is not very robust >in the face of db failures. > > Apparently there is a fix in the latest version of Vpopmail, but I had a look at the code and they basically just moved the order in which the connection to the database is made. So, no real fix here. You will just not get the "Unable to create database" message anymore. :( >Here's a few stats, perhaps something jumps out at someone. And if >anyone can explain exactly what the high level of syscalls indicates (and >how much is too much) and what "context switches" are and how they figure >in, that would be great. > >iostat: > > tty da0 md0 cpu > tin tout KB/t tps MB/s KB/t tps MB/s us ni sy in id > -0 -86 0.00 0 0.00 0.00 0 0.00 -91 -3-108 -5308 > 0 30 12.00 50 0.58 0.00 0 0.00 2 0 6 0 92 > 0 30 10.64 90 0.94 0.00 0 0.00 5 0 6 0 89 > 0 30 8.44 45 0.37 0.00 0 0.00 3 0 7 0 90 > 0 30 9.67 60 0.57 0.00 0 0.00 3 0 7 0 90 > 0 30 11.68 37 0.42 0.00 0 0.00 1 0 3 0 95 > 0 30 0.00 0 0.00 0.00 0 0.00 3 0 2 0 95 > 0 30 10.41 37 0.37 0.00 0 0.00 4 0 8 0 87 > 0 30 11.78 9 0.10 0.00 0 0.00 0 0 3 0 97 > 0 30 5.00 1 0.00 0.00 0 0.00 1 0 1 0 98 > 0 30 8.23 461 3.71 0.00 0 0.00 1 0 5 0 94 > >vmstat (sorry about the wrap - "sy" and "cs" concern me here): > > procs memory page disks faults cpu > r b w avm fre flt re pi po fr sr da0 md0 in sy cs us sy >id > 1 12 0 800380 43788 135 0 0 0 206 21 0 0 159 72 187 -94 >-114 308 > 1 12 0 795304 45700 1656 0 0 0 1919 0 75 0 509 8289 533 3 >8 89 > 0 11 0 793812 46644 610 0 0 0 776 0 18 0 411 2718 211 1 >3 96 > 0 11 0 794232 46264 1668 0 0 0 1335 0 174 0 463 5620 647 3 >10 87 > 0 11 0 793772 46548 793 0 0 0 700 0 15 0 397 3265 301 2 >4 95 > 2 12 0 796476 46344 970 0 0 0 869 0 20 0 393 3871 353 1 >6 93 > 0 12 0 796900 45560 1390 0 0 0 935 0 16 0 381 9099 577 22 >7 71 > 2 15 0 799664 43432 2392 0 0 0 1619 0 32 0 469 9379 704 2 >10 87 > 2 14 0 803256 42372 3131 0 0 0 2794 0 73 0 579 13544 1495 >4 19 77 > 1 14 0 803124 42348 2299 0 0 0 2043 0 65 0 541 10809 1173 >4 13 83 > 1 15 0 803348 41624 2614 0 0 0 2385 0 24 0 448 9002 549 4 >9 87 > 0 15 0 801100 42108 1099 0 0 0 973 0 27 0 470 6665 554 5 >8 87 > 0 17 0 801320 42100 1219 0 0 0 983 0 37 0 442 5557 533 1 >6 92 > 4 16 0 809912 38136 3095 0 0 0 2126 0 89 0 580 13160 1319 >5 16 79 > 2 16 0 810028 59412 3107 0 0 0 3472 5560 109 0 521 11596 799 >3 16 81 > >systat -v: > >Mem:KB REAL VIRTUAL VN PAGER SWAP PAGER > Tot Share Tot Share Free in out in out >Act 424440 7376 793348 16896 83652 count >All 998604 11808 2012876 28536 pages > 715 zfod Interrupts >Proc:r p d s w Csw Trp Sys Int Sof Flt 478 cow 486 >tota 12 64 757 1792 7409 486 319 1702 185992 wire ata0 irq14 509556 act 48 > mux irq2 > 7.5%Sys 0.5%Intr 7.9%User 0.0%Nice 84.2%Idl 248948 inact ahc1 >irq5 | | | | | | | | | | 54108 cache 210 >xl0 irq9 >====>>>> 29544 free fdc0 >irq6 daefr >atkbd0 irqNamei Name-cache Dir-cache 1049 prcfr >sio0 irq4 > Calls hits % hits % react 100 clk >irq0 > 12376 11927 96 5 0 pdwak 128 rtc >irq8 > pdpgs Disks >da0 acd0 fd0 pass0 md0 intrn >KB/t 10.33 0.00 0.00 0.00 0.00 114480 buf tps 48 >0 0 0 0 666 dirtybuf >MB/s 0.48 0.00 0.00 0.00 0.00 70423 desiredvnodes >% busy 17 0 0 0 0 81095 numvnodes > 63089 freevnodes > >Thanks, > >Charles > >-- >Charles Sprickman >spork@inch.com > >_______________________________________________ >freebsd-stable@freebsd.org mailing list >http://lists.freebsd.org/mailman/listinfo/freebsd-stable >To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > > > > -- Jaco van Tonder Software Developer Direct: +27 11 312 2122 :: Fax: +27 11 312 2122 :: Mobile: +27 83 417 5424 Email: jaco@coocoo.za.net :: Web: http://www.coocoo.za.net/ Real Software Engineers don't debug programs; they verify correctness. This process doesn't necessarily involve execution of anything on a computer, except perhaps a Correctness Verification Aid package.