From: Andrew Turner
Date: Tue, 19 May 2020 16:04:27 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: svn commit: r361259 - in head/sys/arm64: arm64 include

Author: andrew
Date: Tue May 19 16:04:27 2020
New Revision: 361259
URL: https://svnweb.freebsd.org/changeset/base/361259

Log:
  Stop performing a full icache sync when the DIC and IDC flags are set

  The DIC and IDC bits in the CTR_EL0 register signal to the kernel when it
  can relax the instruction cache synchronisation operations. The IDC bit
  means we can relax cleaning the data cache to the point of unification
  while the DIC bit means we don't need to invalidate the instruction cache
  for data coherence. In both cases an appropriate barrier is still needed.

  For now only implement the case where both bits are set, as is the case on
  the Neoverse-N1 as used in the Amazon AWS Graviton 2 CPU. Note that this
  behaviour is optional on the N1 so we may later need to implement only one
  or the other bit being set.

  There is a tunable to disable each flag on boot.

  Testing on a 4 core Graviton 2 instance found a significant improvement in
  sys and real time when running "make buildkernel -j4", with no significant
  difference in user time.

  Reviewed by:	markj
  Sponsored by:	Innovate UK
  Differential Revision:	https://reviews.freebsd.org/D24853

Modified:
  head/sys/arm64/arm64/cpufunc_asm.S
  head/sys/arm64/arm64/identcpu.c
  head/sys/arm64/include/cpufunc.h

Modified: head/sys/arm64/arm64/cpufunc_asm.S
==============================================================================
--- head/sys/arm64/arm64/cpufunc_asm.S	Tue May 19 15:27:20 2020	(r361258)
+++ head/sys/arm64/arm64/cpufunc_asm.S	Tue May 19 16:04:27 2020	(r361259)
@@ -133,9 +133,20 @@ ENTRY(arm64_dcache_inv_range)
 END(arm64_dcache_inv_range)
 
 /*
- * void arm64_icache_sync_range(vm_offset_t, vm_size_t)
+ * void arm64_dic_idc_icache_sync_range(vm_offset_t, vm_size_t)
+ * When the CTR_EL0.IDC bit is set cleaning to PoU becomes a dsb.
+ * When the CTR_EL0.DIC bit is set icache invalidation becomes an isb.
  */
-ENTRY(arm64_icache_sync_range)
+ENTRY(arm64_dic_idc_icache_sync_range)
+	dsb	ishst
+	isb
+	ret
+END(arm64_dic_idc_icache_sync_range)
+
+/*
+ * void arm64_aliasing_icache_sync_range(vm_offset_t, vm_size_t)
+ */
+ENTRY(arm64_aliasing_icache_sync_range)
 	/*
 	 * XXX Temporary solution - I-cache flush should be range based for
 	 * PIPT cache or IALLUIS for VIVT or VIPT caches
@@ -146,7 +157,7 @@ ENTRY(arm64_icache_sync_range)
 	dsb	ish
 	isb
 	ret
-END(arm64_icache_sync_range)
+END(arm64_aliasing_icache_sync_range)
 
 /*
  * int arm64_icache_sync_range_checked(vm_offset_t, vm_size_t)

Modified: head/sys/arm64/arm64/identcpu.c
==============================================================================
--- head/sys/arm64/arm64/identcpu.c	Tue May 19 15:27:20 2020	(r361258)
+++ head/sys/arm64/arm64/identcpu.c	Tue May 19 16:04:27 2020	(r361259)
@@ -56,6 +56,24 @@ char machine[] = "arm64";
 extern int adaptive_machine_arch;
 #endif
 
+static SYSCTL_NODE(_machdep, OID_AUTO, cache, CTLFLAG_RD | CTLFLAG_MPSAFE, 0,
+    "Cache management tuning");
+
+static int allow_dic = 1;
+SYSCTL_INT(_machdep_cache, OID_AUTO, allow_dic, CTLFLAG_RDTUN, &allow_dic, 0,
+    "Allow optimizations based on the DIC cache bit");
+
+static int allow_idc = 1;
+SYSCTL_INT(_machdep_cache, OID_AUTO, allow_idc, CTLFLAG_RDTUN, &allow_idc, 0,
+    "Allow optimizations based on the IDC cache bit");
+
+/*
+ * The default implementation of I-cache sync assumes we have an
+ * aliasing cache until we know otherwise.
+ */
+void (*arm64_icache_sync_range)(vm_offset_t, vm_size_t) =
+    &arm64_aliasing_icache_sync_range;
+
 static int
 sysctl_hw_machine(SYSCTL_HANDLER_ARGS)
 {
@@ -977,6 +995,7 @@ identify_cpu_sysinit(void *dummy __unused)
 {
 	int cpu;
 	u_long hwcap;
+	bool dic, idc;
 
 	/* Create a user visible cpu description with safe values */
 	memset(&user_cpu_desc, 0, sizeof(user_cpu_desc));
@@ -985,6 +1004,8 @@ identify_cpu_sysinit(void *dummy __unused)
 	    ID_AA64PFR0_FP_NONE | ID_AA64PFR0_EL1_64 | ID_AA64PFR0_EL0_64;
 	user_cpu_desc.id_aa64dfr0 = ID_AA64DFR0_DebugVer_8;
 
+	dic = (allow_dic != 0);
+	idc = (allow_idc != 0);
 	CPU_FOREACH(cpu) {
 		print_cpu_features(cpu);
 		hwcap = parse_cpu_features_hwcap(cpu);
@@ -993,6 +1014,17 @@ identify_cpu_sysinit(void *dummy __unused)
 		else
 			elf_hwcap &= hwcap;
 		update_user_regs(cpu);
+
+		if (CTR_DIC_VAL(cpu_desc[cpu].ctr) == 0)
+			dic = false;
+		if (CTR_IDC_VAL(cpu_desc[cpu].ctr) == 0)
+			idc = false;
+	}
+
+	if (dic && idc) {
+		arm64_icache_sync_range = &arm64_dic_idc_icache_sync_range;
+		if (bootverbose)
+			printf("Enabling DIC & IDC ICache sync\n");
 	}
 
 	if ((elf_hwcap & HWCAP_ATOMICS) != 0) {

Modified: head/sys/arm64/include/cpufunc.h
==============================================================================
--- head/sys/arm64/include/cpufunc.h	Tue May 19 15:27:20 2020	(r361258)
+++ head/sys/arm64/include/cpufunc.h	Tue May 19 16:04:27 2020	(r361259)
@@ -216,12 +216,15 @@ extern int64_t dczva_line_size;
 #define	cpu_dcache_inv_range(a, s)	arm64_dcache_inv_range((a), (s))
 #define	cpu_dcache_wb_range(a, s)	arm64_dcache_wb_range((a), (s))
 
+extern void (*arm64_icache_sync_range)(vm_offset_t, vm_size_t);
+
 #define	cpu_icache_sync_range(a, s)	arm64_icache_sync_range((a), (s))
 #define cpu_icache_sync_range_checked(a, s) arm64_icache_sync_range_checked((a), (s))
 
 void arm64_nullop(void);
 void arm64_tlb_flushID(void);
-void arm64_icache_sync_range(vm_offset_t, vm_size_t);
+void arm64_dic_idc_icache_sync_range(vm_offset_t, vm_size_t);
+void arm64_aliasing_icache_sync_range(vm_offset_t, vm_size_t);
 int arm64_icache_sync_range_checked(vm_offset_t, vm_size_t);
 void arm64_dcache_wbinv_range(vm_offset_t, vm_size_t);
 void arm64_dcache_inv_range(vm_offset_t, vm_size_t);
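
For readers who want to follow the selection logic outside the kernel, here is a minimal user-space sketch of what identify_cpu_sysinit() does in this change. It is not the committed code. The names select_icache_sync, aliasing_sync, dic_idc_sync and the CTR_*_SHIFT macros are hypothetical stand-ins; the per-CPU CTR_EL0 values are passed in as plain integers rather than read from the register, and the two sync routines are empty stubs. The bit positions (IDC at bit 28, DIC at bit 29) are taken from the Arm architecture definition of CTR_EL0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CTR_EL0 field positions (assumed from the Arm architecture definition). */
#define CTR_IDC_SHIFT	28
#define CTR_DIC_SHIFT	29

typedef void (*icache_sync_fn)(uintptr_t va, size_t len);

/* Hypothetical stub for the conservative path (full I-cache sync). */
static void
aliasing_sync(uintptr_t va, size_t len)
{
	(void)va;
	(void)len;
}

/* Hypothetical stub for the relaxed path (barriers only). */
static void
dic_idc_sync(uintptr_t va, size_t len)
{
	(void)va;
	(void)len;
}

/*
 * Pick the relaxed routine only when every CPU reports both DIC and IDC
 * and neither optimisation has been disabled by its tunable.
 */
static icache_sync_fn
select_icache_sync(const uint64_t *ctr, size_t ncpu, bool allow_dic,
    bool allow_idc)
{
	bool dic = allow_dic;
	bool idc = allow_idc;
	size_t i;

	assert(ctr != NULL || ncpu == 0);
	for (i = 0; i < ncpu; i++) {
		if (((ctr[i] >> CTR_DIC_SHIFT) & 1) == 0)
			dic = false;
		if (((ctr[i] >> CTR_IDC_SHIFT) & 1) == 0)
			idc = false;
	}
	return ((dic && idc) ? dic_idc_sync : aliasing_sync);
}
```

The flags start from the tunables and can only be cleared by the loop, so a single CPU lacking either bit keeps the whole system on the conservative routine. Going by the SYSCTL_NODE and SYSCTL_INT declarations in the diff, the two tunables should appear as machdep.cache.allow_dic and machdep.cache.allow_idc.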