Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 07 Jul 2004 14:17:19 +0200
From:      Jaco van Tonder <jaco@coocoo.za.net>
To:        stable@freebsd.org
Subject:   Re: mysql slowdown under load
Message-ID:  <40EBE9CF.50001@coocoo.za.net>
In-Reply-To: <20040706193151.K6269@shell.inch.com>
References:  <20040706193151.K6269@shell.inch.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Charles Sprickman typed on a keyboard not too far away, on 7/7/2004 1:56 
AM:

>Hi,
>
>I've been trying to track down this problem for some time now.  Since I
>have a temporary fix in place, I can breathe a bit and hope someone here
>has some answers (questions are good too!).
>
>I have a dual Athlon 1600MP box running 4.8-p3.  It's a mail server
>running qmail with vpopmail on top of that, backed by mysql for storing
>user info (user/pass/homedir, etc.).  Webmail is currently served off of
>the box as well (waiting for a vpopmail upgrade to peel this service off
>the box).
>
>In the past week we've had complaints from users about having to re-enter
>their POP passwords, some bounces of legit mail (chkuser patch on
>qmail-smtpd gives a 550 if it can't do a lookup, blech), and password
>failures on webmail.   After some digging, I was able to figure out that
>this all came down to mysql not returning an answer fast enough.
>  
>
All I can say is that I have the exact same problem with the exact same 
software.
Also, this machine's load rarely goes above 2.00 (ie, the only time it 
does is when I build kernel or world).
This machine also NEVER swaps.

>The box generally runs a load of about .70, peaks at 3-4 during heavy spam
>runs.  Each mail delivery can theoretically generate about 6 sql queries
>(don't ask - I wouldn't likely do vpopmail again if I had a choice).
>There are about 3,000 active mail users on here.  Even at high loads, the
>box is not swapping at all, and interactive response such as webmail and
>normal IMAP sessions is still excellent.   But the higher the load goes,
>the more problems mysql seems to have.  The slow query log grows, even
>simple db "pings" can take up to 45 seconds to complete.
>
>Mysql is built from ports, so far I've had these versions running:
>
>4.0.18 w/LinuxThreads
>4.0.20 w/LinuxThreads
>4.0.20 w/native threads
>
>There was no difference going from .18 to .20, but going from LinuxThreads
>to system threads did bring a bit of relief.
>
>My vague hypothesis here (based on past work on larger scale qmail boxes)
>is that at some point the kernel just gets a bit "lost" as to what
>processes get prioritized; qmail with it's "hand the mail from process to
>process" architecture causes lots of processes to be spawned then
>destroyed.  It seems to me that the system gets so busy with spawning all
>of this that mysql languishes.
>
>Looking at iostat, vmstat and systat, it seems like the number of syscalls
>is very high, as is the "context switches" number.  However memory is fine
>(no paging in/out to disk), cpu is fine, disk i/o is fairly low
>(underlying storage is 4 disks on an adaptec raid controller, raid 1+0),
>but mysql continues to "stall".  Mysqladmin and some mrtg graphing shows
>that I'm doing about 15 queries/second peak (which seems pretty low
>compared to past work I've done with mysql).
>
>I saw a previous thread about this problem that never really came to any
>good conclusions; the tester cron'd a script that just connected to mysql,
>ran a simple query and recorded whether the connection failed or not and
>how long it took.  Even under very low load connects would timeout or take
>a long time to complete.
>
>My band-aid is to replicate the db to another box and use that for
>vpopmail for all non-update/insert queries.  So far, so good, but my
>curiousity has the better of me on this; it seems like I should be able to
>run a local mysql server with such a low Q/S rate without issue.
>  
>
I too would like to know why this happens.

>This could also be fixed in vpopmail; the design there is not very robust
>in the face of db failures.
>  
>
Apparently there is a fix in the latest version of Vpopmail, but I had a 
look at the code and they
basically just moved the order in which the connection to the database 
is made. So, no real fix here.
You will just not get the "Unable to create database" message anymore. :(

>Here's a few stats, perhaps something jumps out at someone.   And if
>anyone can explain exactly what the high level of syscalls indicates (and
>how much is too much) and what "context switches" are and how they figure
>in, that would be great.
>
>iostat:
>
>      tty             da0              md0             cpu
> tin tout  KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
>  -0  -86  0.00   0  0.00   0.00   0  0.00 -91 -3-108 -5308
>   0   30 12.00  50  0.58   0.00   0  0.00   2  0  6  0 92
>   0   30 10.64  90  0.94   0.00   0  0.00   5  0  6  0 89
>   0   30  8.44  45  0.37   0.00   0  0.00   3  0  7  0 90
>   0   30  9.67  60  0.57   0.00   0  0.00   3  0  7  0 90
>   0   30 11.68  37  0.42   0.00   0  0.00   1  0  3  0 95
>   0   30  0.00   0  0.00   0.00   0  0.00   3  0  2  0 95
>   0   30 10.41  37  0.37   0.00   0  0.00   4  0  8  0 87
>   0   30 11.78   9  0.10   0.00   0  0.00   0  0  3  0 97
>   0   30  5.00   1  0.00   0.00   0  0.00   1  0  1  0 98
>   0   30  8.23 461  3.71   0.00   0  0.00   1  0  5  0 94
>
>vmstat (sorry about the wrap - "sy" and "cs" concern me here):
>
> procs      memory      page                    disks     faults      cpu
> r b w     avm    fre  flt  re  pi  po  fr  sr da0 md0   in   sy  cs us sy
>id
> 1 12 0  800380  43788  135   0   0   0 206  21   0   0  159   72 187 -94
>-114 308
> 1 12 0  795304  45700 1656   0   0   0 1919   0  75   0  509 8289 533  3
>8 89
> 0 11 0  793812  46644  610   0   0   0 776   0  18   0  411 2718 211  1
>3 96
> 0 11 0  794232  46264 1668   0   0   0 1335   0 174   0  463 5620 647  3
>10 87
> 0 11 0  793772  46548  793   0   0   0 700   0  15   0  397 3265 301  2
>4 95
> 2 12 0  796476  46344  970   0   0   0 869   0  20   0  393 3871 353  1
>6 93
> 0 12 0  796900  45560 1390   0   0   0 935   0  16   0  381 9099 577 22
>7 71
> 2 15 0  799664  43432 2392   0   0   0 1619   0  32   0  469 9379 704  2
>10 87
> 2 14 0  803256  42372 3131   0   0   0 2794   0  73   0  579 13544 1495
>4 19 77
> 1 14 0  803124  42348 2299   0   0   0 2043   0  65   0  541 10809 1173
>4 13 83
> 1 15 0  803348  41624 2614   0   0   0 2385   0  24   0  448 9002 549  4
>9 87
> 0 15 0  801100  42108 1099   0   0   0 973   0  27   0  470 6665 554  5
>8 87
> 0 17 0  801320  42100 1219   0   0   0 983   0  37   0  442 5557 533  1
>6 92
> 4 16 0  809912  38136 3095   0   0   0 2126   0  89   0  580 13160 1319
>5 16 79
> 2 16 0  810028  59412 3107   0   0   0 3472 5560 109   0  521 11596 799
>3 16 81
>
>systat -v:
>
>Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
>        Tot   Share      Tot    Share    Free         in  out     in  out
>Act  424440    7376   793348    16896   83652 count
>All  998604   11808  2012876    28536         pages
>                                                      715 zfod  Interrupts
>Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt    478 cow     486
>tota      12 64       757 1792 7409  486  319 1702 185992 wire        ata0 irq14                                                   509556 act      48
>                                                                      mux irq2
> 7.5%Sys   0.5%Intr  7.9%User  0.0%Nice 84.2%Idl   248948 inact       ahc1
>irq5 |    |    |    |    |    |    |    |    |    |      54108 cache   210
>xl0 irq9
>====>>>>                                            29544 free        fdc0
>irq6                                                           daefr
>atkbd0 irqNamei         Name-cache    Dir-cache                1049 prcfr
>sio0 irq4
>    Calls     hits    %     hits    %                     react   100 clk
>irq0
>    12376    11927   96        5    0                     pdwak   128 rtc
>irq8
>                                                          pdpgs Disks
>da0  acd0   fd0 pass0   md0                       intrn
>KB/t  10.33  0.00  0.00  0.00  0.00                114480 buf tps      48
>0     0     0     0                   666 dirtybuf
>MB/s   0.48  0.00  0.00  0.00  0.00                 70423 desiredvnodes
>% busy   17     0     0     0     0                 81095 numvnodes
>                                                    63089 freevnodes
>
>Thanks,
>
>Charles
>
>--
>Charles Sprickman
>spork@inch.com
>
>_______________________________________________
>freebsd-stable@freebsd.org mailing list
>http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>
>
>  
>


-- 
Jaco van Tonder
Software Developer
Direct: +27 11 312 2122 :: Fax: +27 11 312 2122 :: Mobile: +27 83 417 5424
Email: jaco@coocoo.za.net :: Web: http://www.coocoo.za.net/
Real Software Engineers don't debug programs; they verify correctness.
This process doesn't necessarily involve execution of anything on a
computer, except perhaps a Correctness Verification Aid package.




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?40EBE9CF.50001>