From owner-freebsd-performance@FreeBSD.ORG Sun May 7 08:07:46 2006
From: David Xu
To: freebsd-performance@freebsd.org
Cc: performance@freebsd.org, Robert Watson, current@freebsd.org
Date: Sun, 7 May 2006 16:07:10 +0800
User-Agent: KMail/1.8.2
References: <20060506150622.C17611@fledge.watson.org>
In-Reply-To: <20060506150622.C17611@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200605071607.10869.davidxu@freebsd.org>
Subject: Re: Fine-grained locking for POSIX local sockets (UNIX domain sockets)

On Saturday 06 May 2006 22:16, Robert Watson wrote:
> Dear all,
>
> Attached, please find a patch implementing more fine-grained locking for
> the POSIX local socket subsystem (UNIX domain socket subsystem). In the
> current implementation, we use a single global subsystem lock to protect
> all local IPC over the PF_LOCAL socket type. This has low overhead, but
> can result in significant contention, especially for workloads like MySQL
> which push a great deal of data through UNIX domain sockets, and involve
> many parties. The hope is that by improving granularity, we can reduce
> contention sufficiently to overcome the added cost of increased locking
> overhead (a side-effect of greater granularity). At a high level, here are
> the changes made:

I have tested the patch on my dual PIII machine; the test is super-smack's
select-key.smack. As you said, performance is improved. It seems the uidinfo
lock is one of the bottlenecks in my test; this might be caused by
chgsbsize(), since every socket I/O appears to execute that code even though
the sockets involved are otherwise unrelated. Note that unlike Kris, I don't
have any local changes.

   max    total   count avg cnt_hold cnt_lock name
  2913  1110657 1200300   0    25690    34374 kern/kern_resource.c:1172 (sleep mtxpool)
  5603  7259406  204321  35    20899    20075 kern/kern_descrip.c:378 (Giant)
  6987  1817410 1369257   1    10739     7324 kern/kern_descrip.c:1990 (filedesc structure)
  3713  3771857  120053  31     4553     4612 kern/uipc_usrreq.c:617 (unp_global_mtx)
  7339  1903685 1574656   1     3389     3710 kern/kern_descrip.c:2145 (sleep mtxpool)
 91334  1798700 1369257   1     3227     7916 kern/kern_descrip.c:2011 (filedesc structure)
  4764   223419  204440   1     2549     1693 kern/kern_descrip.c:385 (filedesc structure)
410546  2002932 1369257   1     2238     4103 kern/kern_descrip.c:2010 (sleep mtxpool)
  5064   248004  169944   1     1152     1777 kern/kern_sig.c:1002 (process lock)
    39   149033   74420   2      760      866 kern/kern_synch.c:220 (process lock)
  5567   209654  204321   1      691     1566 kern/kern_descrip.c:438 (filedesc structure)
    70   386915   63807   6      527      412 kern/subr_sleepqueue.c:374 (process lock)
  4358   486842   66802   7      463      291 kern/vfs_bio.c:2424 (vnode interlock)
  6430   488186  214393   2      347      420 kern/kern_lock.c:163 (lockbuilder mtxpool)
  3057   251010   68159   3      313     2290 kern/vfs_vnops.c:796 (vnode interlock)
 13126 15731880 1080092  14      294      227 kern/uipc_usrreq.c:581 (so_snd)
  3161    67316   66402   1      293      267 kern/vfs_bio.c:1464 (vnode interlock)
  3395   205078  204321   1      270      447 kern/kern_descrip.c:433 (sleep mtxpool)
  3011    97692    2323  42      213      239 kern/kern_synch.c:218 (Giant)
    71     9933    3721   2      185      185 i386/i386/pmap.c:2235 (vm page queue mutex)
    14     3454    3512   0      121      155 vm/vm_fault.c:909 (vm page queue mutex)
  3700  2120096  120046  17      119       22 kern/uipc_usrreq.c:705 (so_rcv)
     9     2379    2024   1      103      121 vm/vm_fault.c:346 (vm page queue mutex)
  9389   131186  120070   1       94        1 kern/uipc_socket.c:1101 (so_snd)
     4     2400    3016   0       85     4385 vm/vm_fault.c:851 (vm page queue mutex)
    99     6403     596  10       84       50 i386/i386/pmap.c:2649 (vm page queue mutex)
  5109   201972  204330   0       77      306 kern/sys_socket.c:176 (so_snd)
  1892    47770    2030  23       66       15 vm/vm_fault.c:686 (vm object)
  3380   360770   66716   5       61      117 kern/vfs_bio.c:1445 (buf queue lock)
     7     1597    1380   1       53       90 vm/vm_fault.c:136 (vm page queue mutex)
   347    89307    8267  10       50      330 kern/kern_timeout.c:240 (Giant)
    14    10199   12448   0       38      100 kern/kern_sx.c:157 (lockbuilder mtxpool)
    15    10421    3046   3       37       32 vm/vm_fault.c:1009 (vm page queue mutex)
  3704  2757298  120046  22       31        0 kern/uipc_usrreq.c:696 (unp_mtx)
    27     1519     510   2       27       21 vm/vm_object.c:637 (vm page queue mutex)
    29     7651    5374   1       26       44 geom/geom_io.c:68 (bio queue)
  4517    25540    6495   3       23      250 kern/kern_umtx.c:194 (umtxq_lock)
    39     1910     400   4       20        9 i386/i386/pmap.c:2126 (vm page queue mutex)
     3      669     648   1       19       24 vm/vm_object.c:1551 (vm page queue mutex)
 13346    24581    2592   9       18       36 vm/vnode_pager.c:1164 (vm object)
   293    24130    3559   6       17       37 vm/vm_fault.c:994 (vm object)
 11121    73890     994  74       17        8 kern/kern_intr.c:661 (Giant)
   396     6808     141  48       17        4 vm/vm_map.c:1427 (vm page queue mutex)
  2100    64657   66580   0       16       46 kern/vfs_bio.c:357 (needsbuffer lock)
  4082   304716   66778   4       12       92 sys/buf.h:296 (lockbuilder mtxpool)
   491    10312   12453   0       12       21 kern/kern_sx.c:245 (lockbuilder mtxpool)
    86     1726     210   8       11        7 i386/i386/pmap.c:1966 (vm page queue mutex)
   809    60435   66602   0       11        0 kern/vfs_bio.c:315 (needsbuffer lock)
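
For reference, here is a rough sketch of what chgsbsize() in
kern/kern_resource.c does when socket buffer space is charged against the
owner's uidinfo. This is only an approximation of the 6.x-era code from
memory (the exact UIDINFO_LOCK/UIDINFO_UNLOCK macro names and the limit
check may differ), but it shows why unrelated sockets owned by the same uid
end up serializing on the same per-uid mutex taken from the sleep mutex pool:

/*
 * Approximate sketch only: struct uidinfo and the UIDINFO_* macros are
 * roughly as in sys/resourcevar.h.  Every socket buffer reservation for
 * sockets of the same uid funnels through the same per-uid mutex, so
 * independent connections still contend here.
 */
int
chgsbsize(struct uidinfo *uip, u_int *hiwat, u_int to, rlim_t max)
{
	rlim_t new;

	UIDINFO_LOCK(uip);	/* per-uid mutex from the sleep mtx pool */
	new = uip->ui_sbsize + to - *hiwat;
	/* Refuse to grow past the sbsize limit, but always allow shrinking. */
	if (to > *hiwat && new > max) {
		UIDINFO_UNLOCK(uip);
		return (0);
	}
	uip->ui_sbsize = new;
	UIDINFO_UNLOCK(uip);
	*hiwat = to;
	return (1);
}

If the kern/kern_resource.c:1172 entry at the top of the list above really is
this lock, that per-uid serialization would explain the contention even
though the sockets themselves have nothing to do with each other.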