Date:      Tue, 17 Feb 2004 16:05:06 -0800 (PST)
From:      Julian Elischer <julian@elischer.org>
To:        Kris Gale <kris-fbsd@asn.net>
Cc:        freebsd-threads@freebsd.org
Subject:   Re: More on MySQL -- Fatal trap 12
Message-ID:  <Pine.BSF.4.21.0402171544310.81059-100000@InterJet.elischer.org>
In-Reply-To: <56666.68.106.19.246.1077058348.squirrel@mail.asn.net>



On Tue, 17 Feb 2004, Kris Gale wrote:

> Hey Everyone,
> 
> I've been trying to create a simple program to simulate
> the load my production environment puts on MySQL.


Great!


> 
> What I seem to be seeing is a bogging down of MySQL
> when new threads are being created in bursts.  This
> causes MySQL to temporarily become unresponsive,
> and will sometimes crash the whole system.

My first question is:
"In what state are these threads waiting for work?
Are they in the kernel, or are they in userland?
Is there a single thread that listens on a socket and then hands the
work to worker threads using some userland synchronisation, or do the
threads enter the kernel and wait on the sockets themselves?
(i.e. what does ps -H show?)"




> 
> Here's what my test program is doing:
> 
> - Fork X number of child processes, each opening
> Y number of connections to the database.

Where are the threads coming into this? Is it a threaded server, or a
threaded test program opening TCP connections to the server?

what happens if you have 1 process with X*Y threads?
(just curious).
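
For anyone who wants to try the single-process variant, here is a rough
sketch in Python. It is not the Perl script mentioned in this thread (which
is not shown), and it only opens and closes plain TCP connections; it does
not speak the MySQL protocol or run a select. The names connect_once and
burst, and the host/port parameters, are made up for illustration.

```python
import socket
import threading


def connect_once(host, port, timeout=2.0):
    """Open one TCP connection and close it; return True on success."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def burst(host, port, x, y):
    """Start x*y threads in ONE process, each making one connect attempt.

    Returns (successful_connects, total_attempts).
    """
    results = []
    lock = threading.Lock()

    def worker():
        ok = connect_once(host, port)
        with lock:
            results.append(ok)

    threads = [threading.Thread(target=worker) for _ in range(x * y)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results), len(results)
```

Calling burst(host, port, 90, 20) would attempt 1800 connections from a
single process, which could be compared against the 90-process run.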

> 
> - Each child process loops through the Y connections it
> has open, executing one select statement for each, then
> starting over from the first.
> 
> - If a particular database connection drops, it will enter
> a loop attempting to reconnect, forever.
> 
> When using 45 child processes and 20 connections for
> each, everything is fine.  (900 threads)
All talking to the same database server?

Is the threaded server process what you are debugging, or is the
test program what is being debugged?




> 
> If I bump it up to 90 children and 20 connections, I
> start to see problems.  The database is unable to
> serve the incoming connections fast enough, and
> existing connections become slow or entirely
> unresponsive.  However, if I leave it alone, eventually
> things "catch up."*  That is, as the database server
> slowly manages to create new threads, all of the
> incoming connect requests eventually succeed
> (remember, they're looping).

So it looks like you are trying to debug threads in the server.
What state are the threads in when they are not doing work?

Thread creation has several parts, depending on whether the threads are
to run in the kernel or not. Actually creating the threads in userland
takes memory allocations and structure munging. Making them come to
life may require the creation of a new thread in the kernel. Threads are
held in 'UMA' and are cached.
Use vmstat -z to check the values for:

UPCALL:           40,        0,      0,      0,        0
KSE:              64,        0,    200,    110,      200
KSEGRP:          112,        0,    200,     52,      200
THREAD:          236,        0,    200,     24,      200
PROC:            508,        0,    159,     41,   672667

To help you account for them, you should know that
each process comes with a single preallocated KSEGRP, KSE and thread.
So 159+41 process structures have been created (159 in use), meaning that
there are 200 KSE, KSEGRP and thread structures
associated with them (even the ones not yet in use).
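
The accounting above can be sketched as a few lines of Python. This is only
an illustration: the column order (size, limit, used, free, requests) is an
assumption about how vmstat -z lays out each line, so check it against the
header your own vmstat prints. The helper names parse_zones and
beyond_preallocated are made up here.

```python
# Sample copied from the vmstat -z excerpt in this message.
SAMPLE = """\
UPCALL:           40,        0,      0,      0,        0
KSE:              64,        0,    200,    110,      200
KSEGRP:          112,        0,    200,     52,      200
THREAD:          236,        0,    200,     24,      200
PROC:            508,        0,    159,     41,   672667
"""

# ASSUMED column order; verify against the header of your vmstat -z output.
COLUMNS = ("size", "limit", "used", "free", "requests")


def parse_zones(text):
    """Return {zone: {column: int}} for lines shaped like 'NAME: a, b, c, d, e'."""
    zones = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        try:
            values = [int(field) for field in rest.split(",")]
        except ValueError:
            continue  # header or other non-numeric line
        if len(values) == len(COLUMNS):
            zones[name.strip()] = dict(zip(COLUMNS, values))
    return zones


def beyond_preallocated(zones, zone):
    """In-use structures in `zone` beyond one per PROC structure (used + free)."""
    proc = zones["PROC"]
    return zones[zone]["used"] - (proc["used"] + proc["free"])


zones = parse_zones(SAMPLE)
for zone in ("KSE", "KSEGRP", "THREAD"):
    print(zone, beyond_preallocated(zones, zone))
```

For the sample above all three zones come out at 0, i.e. nothing beyond the
one-per-process baseline. Running the same check while the test is under
way would show how many extra structures the threads account for.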

>  Once everything is
> reconnected, I see 1800 threads in MySQL, and the
> same query/second rate that I saw with 900 threads.
> 
> * Okay, not always.  About half of the time, once
> MySQL falls behind the incoming connections, and
> connect attempts start to fail, the system will crash
> with a "fatal trap 12: page fault while in kernel mode"

"where" in the kernel?

> 
> In the X=90, Y=20 scenario (1800 threads), if the
> test is allowed to continue until everything catches
> up (about 5-10 minutes with KSE), I can stop and
> start the test, triggering the burst of connection
> attempts, but I see only a handful of connect errors.
> However, if I stop and start mysql, I'll see the 10
> minutes of connect errors again.

This suggests that the slowness is in malloc'ing 1800 stacks
etc., but much depends on what state those threads are in
when not working.
Also, are they system scope threads or process scope?

What happens with libthr?




> 
> This seems to imply that somehow these threads
> are being cached, or something is happening that
> allows us to skip whatever bottleneck was causing
> things to bog down.

Yes, threads are cached both in userland and in the kernel.
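
To illustrate why caching would make a second burst cheap, here is a toy
free-list model in Python. It is NOT how libkse or the kernel's UMA
allocator are implemented; StackCache and run_burst are hypothetical names,
and a bytearray merely stands in for a thread stack.

```python
class StackCache:
    """Toy free-list cache: released items are kept for reuse, not freed."""

    def __init__(self, stack_size=4096):
        self.stack_size = stack_size
        self.free_list = []
        self.fresh_allocations = 0

    def acquire(self):
        if self.free_list:
            return self.free_list.pop()   # cheap path: reuse a cached item
        self.fresh_allocations += 1       # expensive path: allocate a new one
        return bytearray(self.stack_size)

    def release(self, stack):
        self.free_list.append(stack)


def run_burst(cache, n):
    """Acquire n stacks at once, release them all; return fresh allocations made."""
    before = cache.fresh_allocations
    stacks = [cache.acquire() for _ in range(n)]
    for stack in stacks:
        cache.release(stack)
    return cache.fresh_allocations - before


cache = StackCache()
cold = run_burst(cache, 1800)          # first burst: every stack is new
warm = run_burst(cache, 1800)          # second burst: served from the cache
restarted = run_burst(StackCache(), 1800)  # a fresh cache models a restart
print(cold, warm, restarted)
```

In this model the first burst allocates 1800 items, the second allocates
none, and a brand-new cache (standing in for restarting the server) pays the
full cost again, which matches the pattern described in the quoted text.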
> 
> Does this look like a fixable problem with KSE
> to anyone on this list?

Of course it's fixable. We just need more info :-)

> 
> Let me know if you'd like a copy of the perl script
> I've written to try out all of these things.
> 
> Kris Gale
> _______________________________________________
> freebsd-threads@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-threads
> To unsubscribe, send any mail to "freebsd-threads-unsubscribe@freebsd.org"
> 


