From: David Xu
To: freebsd-performance@freebsd.org
Cc: Robert Watson, performance@freebsd.org, current@freebsd.org, Kris Kennaway
Date: Mon, 8 May 2006 18:43:31 +0800
Subject: Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)

On Monday 08 May 2006 14:52, Kris Kennaway wrote:
> OK, David's patch fixes the umtx thundering herd (and seems to give a
> 4-6% boost).  I also fixed a thundering herd in FILEDESC_UNLOCK (which
> was also waking up 2-7 CPUs at once about 30% of the time) by doing
> s/wakeup/wakeup_one/.  This did not seem to give a performance impact
> on this test though.
>....
> filedesc contention is down by a factor of 3-4, with corresponding
> reduction in the average hold time.  The process lock contention
> coming from the signal delivery wakeup has also gone way down for some
> reason.
>

I found that mysqld frequently calls alarm() (in its file thr_alarm.c)
and thr_kill() to send SIGALRM to its timer thread to wake it up. The
timer thread itself is blocked in sigwait(). Normally the alarm timer
expires within a second, so the kernel periodically calls psignal() to
find a thread that can handle the signal. This means the kernel has to
periodically walk the thread list with the process lock and the
scheduler lock held, which is very expensive.

Most of the time thr_kill() wakes up the timer thread earlier. In the
thr_kill syscall, the kernel has to walk the thread list to find the
thread whose id matches the given id. The function thread_find() uses a
linear search, which is slow: if there are lots of threads in the
process, the process lock is held for too long. I think that is the
reason you have seen so much process lock contention. If you define
USE_ALARM_THREAD in the mysql header file, the contention should
decrease (I hope). Patch:

--- my_pthread.h.old	Mon May  8 18:16:56 2006
+++ my_pthread.h	Mon May  8 18:17:07 2006
@@ -267,6 +267,8 @@
 
 /* Test first for RTS or FSU threads */
 
+#define USE_ALARM_THREAD
+
 #if defined(PTHREAD_SCOPE_GLOBAL) && !defined(PTHREAD_SCOPE_SYSTEM)
 #define HAVE_rts_threads
 extern int my_pthread_create_detached;

> unp contention has risen a bit.  The other big gain is to sleep
> mtxpool contention, which roughly doubled:
>
> /*
>  * Change the total socket buffer size a user has used.
>  */
> int
> chgsbsize(uip, hiwat, to, max)
>         struct  uidinfo *uip;
>         u_int  *hiwat;
>         u_int   to;
>         rlim_t  max;
> {
>         rlim_t new;
>
>         UIDINFO_LOCK(uip);
>
> So the next question is how can that be optimized?
>

You may use atomic_cmpset_int in a loop to avoid the context switch, or
use an adaptive mutex, but there is no adaptive mutex type you can
specify. rlim_t is a 64-bit integer, so the atomic operation cannot be
used on it, but a 64-bit integer might not be necessary for a socket
buffer size.

> Kris

David Xu
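
A minimal userland sketch of the compare-and-set loop suggested above,
written with C11 <stdatomic.h> rather than the kernel's
atomic_cmpset_int. It assumes the per-user total fits in 32 bits and
that the function moves one buffer's reservation from *hiwat to `to`,
refusing growth past max. The names sb_total and chgsbsize_cas are
hypothetical and are not taken from the FreeBSD source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-user socket buffer total. */
static _Atomic uint32_t sb_total;

/*
 * Change one buffer's reservation from *hiwat to `to` without taking a
 * lock.  Returns 1 on success, 0 if growing would exceed max.
 * Assumption: *hiwat is already accounted for in *total.
 */
static int
chgsbsize_cas(_Atomic uint32_t *total, uint32_t *hiwat, uint32_t to,
    uint32_t max)
{
	uint32_t old;
	uint64_t want;

	old = atomic_load(total);
	do {
		assert(old >= *hiwat);
		/* Compute in 64 bits so the limit check cannot wrap. */
		want = (uint64_t)old - *hiwat + to;
		/* Refuse growth past max, but always allow shrinking. */
		if (to > *hiwat && want > max)
			return (0);
		/* On failure the CAS reloads `old` and the loop retries. */
	} while (!atomic_compare_exchange_weak(total, &old, (uint32_t)want));

	*hiwat = to;
	return (1);
}
```

A caller would pass the address of the shared counter and its own
high-water mark, for example chgsbsize_cas(&sb_total, &hiwat, 4096,
8192). A contended update retries in place instead of sleeping on a
mutex, which is the point of the suggestion. Whether a 32-bit total is
acceptable for the real limit is an open question here.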