From: Alan Cox
Date: Mon, 01 Oct 2012 11:57:03 -0500
Message-ID: <5069CB5F.40100@rice.edu>
To: "Jayachandran C."
References: <505DE9D4.5010204@rice.edu>
Cc: mips@freebsd.org, Alan Cox
Subject: Re: optimizing TLB invalidations

On 10/01/2012 11:16, Jayachandran C. wrote:
> On Sat, Sep 22, 2012 at 10:09 PM, Alan Cox wrote:
>> Can you please test the attached patch?  It introduces a new TLB
>> invalidation function for efficiently invalidating address ranges and uses
>> this function in pmap_remove().
>>
>> Basically, the function looks at the size of the address range in order to
>> decide how best to perform the invalidation.  If the range is small compared
>> to the TLB size, it probes the TLB for pages in the range.  That said, the
>> function understands that pages come in pairs, and so it won't probe for odd
>> page numbers.  In contrast, the current code in pmap_remove() will probe for
>> both the even and odd page.  On the other hand, if the range is large, then
>> the function changes its approach.  It iterates over the TLB entries
>> checking each to see if it falls within the range.  This can eliminate an
>> enormous number of TLB probes when a large virtual address range is
>> unmapped.  Finally, on a multiprocessor, this change will reduce the number
>> of IPIs to invalidate TLB entries.  There will be one IPI per range rather
>> than one per page.
>>
>> Ultimately, this new function could be applied elsewhere, like
>> pmap_protect(), but that's a patch for another day.
> Tested this on my XLP 64 bit SMP config, and did not see any issues. The
> compilation test did not show much change in performance, but I think
> I need to run a multi-threaded benchmark to see the performance
> improvement.
>

Yes, I agree.
Under a compilation test, the FreeBSD malloc(3)/free(3)
implementation will occasionally release a large chunk of memory (4MB)
back to the kernel.  If all of that chunk was in use, then we'll save
about 900 TLB probes.  But this doesn't happen very often.  Under a
compilation workload, most of the bulk destruction of mappings happens
in pmap_remove_pages(), not pmap_remove().

The place where you'll probably see an easily discernible effect is
when pmap_qremove() is modified to use the ranged TLB invalidation.
pmap_qremove() is used when we unmap data from the buffer cache, and it
must perform a TLB shootdown on every CPU.

Alan
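
The range-based strategy described in the quoted message can be sketched
roughly as follows.  This is a user-space simulation, not the patch
itself: the sim_tlb_* names, the 64-entry NTLB, the 4KB page size, and
the "fewer pairs than TLB entries" threshold are all assumptions made
for illustration, and the single-IPI-per-range behaviour on SMP is not
modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12                               /* assumed 4KB pages */
#define PAGE_SIZE  ((uintptr_t)1 << PAGE_SHIFT)
#define PAIR_SIZE  (2 * PAGE_SIZE)  /* one entry maps an even/odd page pair */
#define NTLB       64               /* hypothetical TLB size */

struct sim_tlb_entry {
	bool      valid;
	uintptr_t vpn2;             /* pair-aligned base virtual address */
};

static struct sim_tlb_entry sim_tlb[NTLB];
static unsigned long probe_count;   /* number of simulated TLB probes */

/* Fill the simulated TLB with consecutive page pairs starting at base. */
static void
sim_tlb_fill(uintptr_t base)
{
	for (int i = 0; i < NTLB; i++) {
		sim_tlb[i].valid = true;
		sim_tlb[i].vpn2 = base + (uintptr_t)i * PAIR_SIZE;
	}
}

static int
sim_tlb_count_valid(void)
{
	int n = 0;

	for (int i = 0; i < NTLB; i++)
		n += sim_tlb[i].valid ? 1 : 0;
	return (n);
}

/* Stand-in for a hardware probe: one probe covers both pages of a pair. */
static int
sim_tlb_probe(uintptr_t va)
{
	probe_count++;
	va &= ~(PAIR_SIZE - 1);
	for (int i = 0; i < NTLB; i++)
		if (sim_tlb[i].valid && sim_tlb[i].vpn2 == va)
			return (i);
	return (-1);
}

/* Invalidate every entry that maps any page in [start, end). */
static void
sim_tlb_invalidate_range(uintptr_t start, uintptr_t end)
{
	assert(start <= end);
	start &= ~(PAIR_SIZE - 1);                        /* round down */
	end = (end + PAIR_SIZE - 1) & ~(PAIR_SIZE - 1);   /* round up */

	if ((end - start) / PAIR_SIZE < NTLB) {
		/* Small range: probe once per pair, never for the odd page. */
		for (uintptr_t va = start; va < end; va += PAIR_SIZE) {
			int i = sim_tlb_probe(va);

			if (i >= 0)
				sim_tlb[i].valid = false;
		}
	} else {
		/* Large range: walk the TLB and test each entry instead. */
		for (int i = 0; i < NTLB; i++)
			if (sim_tlb[i].valid && sim_tlb[i].vpn2 >= start &&
			    sim_tlb[i].vpn2 < end)
				sim_tlb[i].valid = false;
	}
}
```

With these assumed parameters, a four-page range costs two probes (one
per pair), while a 4MB range costs no probes at all, only one pass over
the NTLB entries.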