Date: Mon, 8 May 2006 18:43:31 +0800
From: David Xu <davidxu@freebsd.org>
To: freebsd-performance@freebsd.org
Cc: Robert Watson <rwatson@freebsd.org>, performance@freebsd.org,
    current@freebsd.org, Kris Kennaway <kris@obsecurity.org>
Subject: Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)
Message-ID: <200605081843.31825.davidxu@freebsd.org>
In-Reply-To: <20060508065207.GA20386@xor.obsecurity.org>
References: <20060506150622.C17611@fledge.watson.org>
    <20060507230430.GA6872@xor.obsecurity.org>
    <20060508065207.GA20386@xor.obsecurity.org>

On Monday 08 May 2006 14:52, Kris Kennaway wrote:
> OK, David's patch fixes the umtx thundering herd (and seems to give a
> 4-6% boost).  I also fixed a thundering herd in FILEDESC_UNLOCK (which
> was also waking up 2-7 CPUs at once about 30% of the time) by doing
> s/wakeup/wakeup_one/.  This did not seem to give a performance impact
> on this test though.
>....
> filedesc contention is down by a factor of 3-4, with corresponding
> reduction in the average hold time.  The process lock contention
> coming from the signal delivery wakeup has also gone way down for some
> reason.
>

I found that mysqld frequently calls alarm() in its file thr_alarm.c,
and thr_kill() to send SIGALRM to its timer thread to wake it up. The
timer thread itself is blocked in sigwait(). Normally the alarm timer
expires within a second, so the kernel periodically calls psignal() to
find a thread which can handle the signal. This means the kernel has to
periodically walk the thread list with the process lock and the
scheduler lock held, which is very expensive.

Most of the time thr_kill() wakes up the timer thread earlier. In the
thr_kill syscall, the kernel has to walk the thread list to find the
thread whose id matches the given id. The function thread_find() uses a
linear search, which is slow: if there are lots of threads in the
process, the process lock is held too long. I think that is the reason
why you have seen so much process lock contention. If you define
USE_ALARM_THREAD in the mysql header file, the contention should
decrease (I hope). Patch:

--- my_pthread.h.old	Mon May  8 18:16:56 2006
+++ my_pthread.h	Mon May  8 18:17:07 2006
@@ -267,6 +267,8 @@
 /* Test first for RTS or FSU threads */
 
+#define USE_ALARM_THREAD
+
 #if defined(PTHREAD_SCOPE_GLOBAL) && !defined(PTHREAD_SCOPE_SYSTEM)
 #define HAVE_rts_threads
 extern int my_pthread_create_detached;

> unp contention has risen a bit.
> The other big gain is to sleep
> mtxpool contention, which roughly doubled:
>
> /*
>  * Change the total socket buffer size a user has used.
>  */
> int
> chgsbsize(uip, hiwat, to, max)
>         struct  uidinfo *uip;
>         u_int  *hiwat;
>         u_int   to;
>         rlim_t  max;
> {
>         rlim_t new;
>
>         UIDINFO_LOCK(uip);
>
> So the next question is how can that be optimized?
>

You may use atomic_cmpset_int in a loop to avoid a context switch, or
use an adaptive mutex, but there is no adaptive mutex type you can
specify. rlim_t is a 64-bit integer, so the atomic operation cannot be
used on it, but a 64-bit integer might not be necessary for a socket
buffer size.

> Kris

David Xu
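
To illustrate the compare-and-set loop suggested above, here is a minimal userland sketch, not the kernel code. It assumes the per-uid total can be narrowed to 32 bits, and it uses C11 atomic_compare_exchange_weak as a stand-in for the kernel's atomic_cmpset_int. The name chg_sbsize_lockless, its parameters, and the accounting rule (new total = old total + to - *hiwat, with growth past max rejected) are assumptions made for illustration, since only the head of chgsbsize() was quoted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical lock-free variant of the accounting step.
 * total: shared per-user byte count (assumed to fit in 32 bits)
 * hiwat: this socket buffer's current reservation, owned by the caller
 * to:    requested new reservation
 * max:   limit on the shared total
 * Returns 1 on success, 0 if growing would exceed max.
 */
static int
chg_sbsize_lockless(_Atomic uint32_t *total, uint32_t *hiwat,
    uint32_t to, uint32_t max)
{
	uint32_t old, next;
	uint64_t want;

	old = atomic_load(total);
	do {
		/* Assumed invariant: the total includes our reservation. */
		assert(old >= *hiwat);
		/* Compute in 64 bits so a large request cannot wrap. */
		want = (uint64_t)old + to - *hiwat;
		/* Refuse growth past the limit, always allow shrinking. */
		if (to > *hiwat && want > max)
			return (0);
		next = (uint32_t)want;
		/* On failure, old is reloaded and the loop retries. */
	} while (!atomic_compare_exchange_weak(total, &old, next));

	*hiwat = to;
	return (1);
}
```

A failed compare-and-set only means another CPU changed the total in between, so the loop recomputes and retries instead of sleeping on a mutex. That is the context-switch saving mentioned above, under the stated 32-bit assumption.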