Message-Id: <200612200440.kBK4edKi041545@repoman.freebsd.org>
From: David Xu
Date: Wed, 20 Dec 2006 04:40:39 +0000 (UTC)
To: src-committers@FreeBSD.org, cvs-src@FreeBSD.org, cvs-all@FreeBSD.org
Subject: cvs commit: src/sys/amd64/amd64 cpu_switch.S genassym.c machdep.c src/sys/i386/i386 genassym.c machdep.c swtch.s src/sys/ia64/ia64 machdep.c src/sys/kern kern_umtx.c src/sys/sys pcpu.h umtx.h
X-FreeBSD-CVS-Branch: HEAD
List-Id: CVS commit messages for the src tree

davidxu     2006-12-20 04:40:39 UTC

  FreeBSD src repository

  Modified files:
    sys/amd64/amd64      cpu_switch.S genassym.c machdep.c
    sys/i386/i386        genassym.c machdep.c swtch.s
    sys/ia64/ia64        machdep.c
    sys/kern             kern_umtx.c
    sys/sys              pcpu.h umtx.h
  Log:
  Add an lwpid field to the per-CPU structure; the lwpid is the id of the
  thread currently running on each CPU. This allows us to add in-kernel
  adaptive spinning for user-level mutexes.
  While spinning in user space is possible, it can hardly be implemented
  efficiently, without wasting CPU cycles, unless the kernel exports the
  correct thread running state. Exporting that state is unlikely to be
  implemented soon, because the interfaces first have to be designed and
  stabilized. This implementation is transparent to user space and can be
  disabled dynamically.

  With this change, the performance of a mutex ping-pong program improves
  massively on SMP machines. Performance of the MySQL super-smack select
  benchmark increases by about 7% on a dual Intel dual-core (Core 2) Xeon
  machine. This indicates that on systems which have several CPUs and low
  system-call overhead (Athlon64, Opteron, and Core 2 are known to be
  fast), adaptive spinning does help performance.

  Added sysctls:
    kern.threads.umtx_dflt_spins
        If the sysctl value is non-zero, a zero umutex.m_spincount causes
        the sysctl value to be used as the spin cycle count.
    kern.threads.umtx_max_spins
        Sets the upper limit of the spin cycle count.

  Tested on: Athlon64 X2 3800+, Dual Xeon 5130

  Revision  Changes    Path
  1.155     +5 -3      src/sys/amd64/amd64/cpu_switch.S
  1.160     +2 -0      src/sys/amd64/amd64/genassym.c
  1.667     +1 -0      src/sys/amd64/amd64/machdep.c
  1.156     +2 -0      src/sys/i386/i386/genassym.c
  1.646     +1 -0      src/sys/i386/i386/machdep.c
  1.153     +2 -0      src/sys/i386/i386/swtch.s
  1.214     +3 -0      src/sys/ia64/ia64/machdep.c
  1.58      +70 -0     src/sys/kern/kern_umtx.c
  1.20      +1 -0      src/sys/sys/pcpu.h
  1.28      +2 -1      src/sys/sys/umtx.h
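
  For readers who want a picture of the idea in the log, here is a minimal
  user-space sketch in C, not the code from kern_umtx.c. It assumes that the
  spin budget comes from umutex.m_spincount, falls back to the
  umtx_dflt_spins value when that field is zero, and is clamped to
  umtx_max_spins; it also assumes that spinning continues only while the
  mutex owner's id appears in some CPU's lwpid slot. Every name prefixed
  with sketch_, the CPU count, and the initial sysctl values are
  hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NCPU     4   /* hypothetical CPU count */
#define SKETCH_UNOWNED  0   /* hypothetical owner value of a free mutex */

/* Stand-in for the per-CPU lwpid field: id of the thread on each CPU. */
static uint32_t sketch_pcpu_lwpid[SKETCH_NCPU];

/* Stand-ins for the two sysctls; the initial values are arbitrary. */
static unsigned sketch_dflt_spins = 0;
static unsigned sketch_max_spins = 1000;

/* Pick the spin budget: per-mutex count, else the default, capped at max. */
static unsigned
sketch_spin_budget(unsigned m_spincount)
{
	unsigned n = m_spincount;

	if (n == 0)
		n = sketch_dflt_spins;
	if (n > sketch_max_spins)
		n = sketch_max_spins;
	return (n);
}

/* True if the given thread id is currently running on some CPU. */
static bool
sketch_owner_on_cpu(uint32_t owner)
{
	for (int i = 0; i < SKETCH_NCPU; i++)
		if (sketch_pcpu_lwpid[i] == owner)
			return (true);
	return (false);
}

/*
 * Try to take the mutex, retrying while the owner is on a CPU and the
 * budget lasts. Returns true if acquired, false if the caller should
 * fall back to sleeping.
 */
static bool
sketch_adaptive_spin(_Atomic uint32_t *m_owner, uint32_t self,
    unsigned m_spincount)
{
	unsigned budget = sketch_spin_budget(m_spincount);

	assert(self != SKETCH_UNOWNED);
	for (unsigned i = 0; ; i++) {
		uint32_t expected = SKETCH_UNOWNED;

		if (atomic_compare_exchange_strong(m_owner, &expected, self))
			return (true);
		/* On failure, expected holds the current owner's id. */
		if (i >= budget || !sketch_owner_on_cpu(expected))
			return (false);
	}
}
```

  In this sketch a budget of zero still makes one acquisition attempt and
  then gives up, which mirrors the statement that the feature can be
  disabled dynamically: setting the default to zero leaves mutexes with a
  zero m_spincount without any spinning.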