From owner-svn-src-all@freebsd.org Thu May 31 09:56:03 2018
Message-Id: <201805310956.w4V9u2rL084194@repo.freebsd.org>
From: Mateusz Guzik
Date: Thu, 31 May 2018 09:56:02 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r334419 - head/sys/amd64/amd64
X-SVN-Group: head
X-SVN-Commit-Author: mjg
X-SVN-Commit-Paths: head/sys/amd64/amd64
X-SVN-Commit-Revision: 334419
X-SVN-Commit-Repository: base

Author: mjg
Date: Thu May 31 09:56:02 2018
New Revision: 334419
URL: https://svnweb.freebsd.org/changeset/base/334419

Log:
  amd64: switch pagecopy from non-temporal stores to rep movsq

  The copied data is accessed in part soon after the copy, so bypassing the
  cache with non-temporal stores results in additional cache misses during
  a -j 1 buildkernel WITHOUT_CTF=yes KERNFAST=1, as measured with pmc stat.

  before:
           256165411      cache-references          #    0.003 refs/inst
            15105408      cache-misses              #    5.897%
               20.70      real                      #   99.67% cpu
               13.24      user                      #   63.94% cpu
                7.40      sys                       #   35.73% cpu

  after:
           256764469      cache-references          #    0.003 refs/inst
            11913551      cache-misses              #    4.640%
               20.70      real                      #   99.67% cpu
               13.19      user                      #   63.73% cpu
                7.44      sys                       #   35.95% cpu

  Note that the real time did not change, but traffic to RAM was reduced
  (multiple measurements were performed, switching the implementation at
  runtime). Since nobody else is using non-temporal stores for this and
  there is no apparent benefit, at least these days, don't use them either.

  As a side note, the pagecopy arguments should probably be reversed so
  that they do not have to be flipped around in the primitive.

  Discussed with:	jeff

Modified:
  head/sys/amd64/amd64/support.S

Modified: head/sys/amd64/amd64/support.S
==============================================================================
--- head/sys/amd64/amd64/support.S	Thu May 31 09:11:21 2018	(r334418)
+++ head/sys/amd64/amd64/support.S	Thu May 31 09:56:02 2018	(r334419)
@@ -281,26 +281,12 @@ END(memset)
  */
 ENTRY(pagecopy)
 	PUSH_FRAME_POINTER
-	movq	$-PAGE_SIZE,%rax
-	movq	%rax,%rdx
-	subq	%rax,%rdi
-	subq	%rax,%rsi
-1:
-	prefetchnta (%rdi,%rax)
-	addq	$64,%rax
-	jne	1b
-2:
-	movq	(%rdi,%rdx),%rax
-	movnti	%rax,(%rsi,%rdx)
-	movq	8(%rdi,%rdx),%rax
-	movnti	%rax,8(%rsi,%rdx)
-	movq	16(%rdi,%rdx),%rax
-	movnti	%rax,16(%rsi,%rdx)
-	movq	24(%rdi,%rdx),%rax
-	movnti	%rax,24(%rsi,%rdx)
-	addq	$32,%rdx
-	jne	2b
-	sfence
+	movq	$PAGE_SIZE/8,%rcx
+	movq	%rdi,%r9
+	movq	%rsi,%rdi
+	movq	%r9,%rsi
+	rep
+	movsq
 	POP_FRAME_POINTER
 	ret
 END(pagecopy)
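
For readers less used to the string instructions, the new pagecopy body can
be modelled roughly in C as below. This is only an illustrative sketch, not
the kernel code: the name pagecopy_model and the PAGE_SIZE value of 4096 are
assumptions made for the example. It shows why the two argument registers
are swapped (pagecopy takes the source first, while movsq reads from %rsi
and writes to %rdi) and why the count is PAGE_SIZE/8 (each movsq moves one
8-byte word).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096	/* assumption: amd64 base page size */

/*
 * Hypothetical C model of the new pagecopy body. The first argument is
 * the source and the second the destination, matching the register use
 * in the diff (loads from %rdi, stores to %rsi on entry).
 */
void
pagecopy_model(const void *from, void *to)
{
	const uint64_t *src = from;	/* arrives in %rdi, moved to %rsi */
	uint64_t *dst = to;		/* arrives in %rsi, moved to %rdi */
	size_t count = PAGE_SIZE / 8;	/* %rcx: number of 8-byte words */

	/* rep movsq: copy one quadword per iteration until %rcx is zero. */
	while (count-- != 0)
		*dst++ = *src++;
}
```

Unlike the removed movnti loop, these are ordinary stores, so the
destination page may stay in the cache for the accesses that follow the
copy, which is the effect the log attributes the lower miss count to.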