From: "K. Macy" <kmatthew.macy@gmail.com>
Reply-To: kmacy@freebsd.org
To: freebsd-arch@freebsd.org
Cc: jeff@freebsd.org, alc@cs.rice.edu
Date: Sat, 17 Apr 2010 14:55:30 -0700
Subject: Moving forward with vm page lock

Last February Jeff Roberson first shared his vm page lock patch with me. The
general premise is that modification of a vm_page_t is no longer protected by
the global "vm page queue mutex" but is instead protected by an entry in an
array of locks, to which each vm_page_t is hashed by its physical address.
This complicates pmap somewhat because it increases the number of cases where
retry logic is required if we need to drop the pmap lock in order to first
acquire the page lock (see pa_tryrelock).

I've continued refining Jeff's initial page lock patch by resolving lock
ordering issues in vm_pageout, eliminating pv_lock, and eliminating the need
for pmap_collect on amd64. Rather than exposing ourselves to a race condition
by dropping the locks in pmap_collect, I pre-allocate any necessary pv_entrys
before changing any pmap state. This complicated calls to demote slightly, but
that can probably be simplified later. Currently only amd64 supports this;
other platforms map vm_page_lock(m) to the vm page queue mutex.

The current version of the patch can be found at:

http://people.freebsd.org/~kmacy/diffs/head_page_lock.diff

I've been refining it in a subversion branch at:

svn://svn.freebsd.org/base/user/kmacy/head_page_lock

On my workloads at a CDN startup I've seen as much as a 50% increase in
lighttpd throughput (3.2 Gbps -> 4.8 Gbps).
At Jeff's request I've done some basic measurements with buildkernel to
demonstrate that, at least on my hardware (a dual 4-core "CPU: Intel(R)
Xeon(R) CPU L5420 @ 2.50GHz (2500.01-MHz K8-class CPU)" with 64GB of RAM),
there is no performance regression. I did 2 warm-up runs followed by 10
samples of "time make -j16 buildkernel KERNCONF=GENERIC -DNO_MODULES
-DNO_KERNELCONFIG -DNO_KERNELDEPEND" on a ZFS file system on a twa-based
RAID device, both with and without page_lock. Wall clock time is
consistently just under a second lower (faster build time) for the
page_lock kernel. The bulk of the time is actually spent in user, so it is
more meaningful to compare system times. I attached the logs of the runs
and the two files I fed to ministat.

ministat -c 95 -w 72 base page_lock
x base
+ page_lock
+------------------------------------------------------------------------+
|         +      ++                                                      |
|+      ++ +++ +                                  x    xxxx        xxxxx|
|    |__AM__|                                        |___AM__|           |
+------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x  10         47.35         49.09         48.64        48.417    0.53416706
+  10         40.04         41.52         40.98        40.844    0.41494846
Difference at 95.0% confidence
        -7.573 +/- 0.449396
        -15.6412% +/- 0.928179%
        (Student's t, pooled s = 0.478287)

ramsan2.lab1# head -2 prof.out
debug.lock.prof.stats:
   max  wait_max     total  wait_total   count  avg  wait_avg  cnt_hold  cnt_lock  name
ramsan2.lab1# sort -nrk 4 prof.out | head
  1592   243918   1768980  12026988  287680    6    41  0  112005  /usr/home/kmacy/head_page_lock/sys/vm/vm_page.c:1065 (sleep mutex:vm page queue free mutex)
  3967   750285   1678130   9447247  276594    6    34  0  104952  /usr/home/kmacy/head_page_lock/sys/vm/vm_page.c:1388 (sleep mutex:vm page queue mutex)
 18234   163969   5417360   9213400  282459   19    32  0    6548  /usr/home/kmacy/head_page_lock/sys/amd64/amd64/pmap.c:3372 (sleep mutex:page lock)
173094   134890  18226507   8195920   49757  366   164  0     625  /usr/home/kmacy/head_page_lock/sys/kern/vfs_subr.c:2091 (lockmgr:zfs)
   254   167136     38222   5153728    2736   13  1883  0    2333  /usr/home/kmacy/head_page_lock/sys/amd64/amd64/pmap.c:550 (sleep mutex:page lock)
  1160   104774   1624269   4380034  279240    5    15  0  107998  /usr/home/kmacy/head_page_lock/sys/vm/vm_page.c:1508 (sleep mutex:vm page queue free mutex)
  1107    80128   1581048   3377896  274341    5    12  0  100130  /usr/home/kmacy/head_page_lock/sys/vm/vm_page.c:1300 (sleep mutex:vm page queue mutex)
104802   284128  14712290   2970729  259423   56    11  0    1900  /usr/home/kmacy/head_page_lock/sys/vm/vm_object.c:721 (sleep mutex:page lock)
 84339   158037   1455568   2875384   85147   17    33  0     292  /usr/home/kmacy/head_page_lock/sys/kern/vfs_cache.c:390 (rw:Name Cache)
     9   995901       236   2468160      46    5 53655  0      45  /usr/home/kmacy/head_page_lock/sys/kern/sched_ule.c:2552 (spin mutex:sched lock 4)

Both Giovanni Trematerra and I have run stress2 on it for extended periods
with no problems in evidence. I'd like to see this go into HEAD by the end
of this month. Once this change has proven to be stable with a wider
audience I will extend it to i386.

Thanks,
Kip