Date: Tue, 8 Aug 2023 22:46:12 +0900
From: Tomoaki AOKI
To: freebsd-current@freebsd.org
Subject: Re: [Intel AlderLake] Read&Write files to FAT32 or UFS partition cause data corrupt due to P-Core&E-Core
Message-Id: <20230808224612.c3889d6e20b6fc980f5278cc@dec.sakura.ne.jp>
References: <7A743668-B5AA-4679-9F56-9A6220CBBC14@karels.net> <59cbcfe2-cd53-69d8-65d6-7a79e656f494@FreeBSD.org> <1f968af1-1c57-9a09-7e01-145a5262e27f@FreeBSD.org> <20230806181238.858f58e25dfd0f99269cfe53@dec.sakura.ne.jp> <20230808063735.e8e1d3ede370a18f200a6f48@dec.sakura.ne.jp>
Organization: Junchoon corps
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On Tue, 8 Aug 2023 15:38:46 +0300
Konstantin Belousov wrote:

> On Tue, Aug 08, 2023 at 06:37:35AM +0900, Tomoaki AOKI wrote:
> > On Sun, 6 Aug 2023 12:55:07 +0300
> > Konstantin Belousov wrote:
> >
> > > On Sun, Aug 06, 2023 at 06:12:38PM +0900, Tomoaki AOKI wrote:
> > > > On Wed, 23 Feb 2022 01:30:28 +0200
> > > > Konstantin Belousov wrote:
> > > >
> > > > > On Tue, Feb 22, 2022 at 06:23:17PM -0500, Alexander Motin wrote:
> > > > > > On 22.02.2022 17:46, Konstantin Belousov wrote:
> > > > > > > Ok, the next step is to get the CPU feature reports from P- vs. E- cores.
> > > > > > > Patch below should work, with verbose boot.
> > > > > > >
> > > > > > Not much difference on that level:
> > > > > >
> > > > > > --- zzzp 2022-02-22 18:18:24.531704000 -0500
> > > > > > +++ zzze 2022-02-22 18:18:18.631236000 -0500
> > > > > > @@ -1,22 +1,21 @@
> > > > > > -CPU 2: 12th Gen Intel(R) Core(TM) i7-12700K (3609.60-MHz K8-class CPU)
> > > > > > +CPU 16: 12th Gen Intel(R) Core(TM) i7-12700K (3609.60-MHz K8-class CPU)
> > > > > > Origin="GenuineIntel" Id=0x90672 Family=0x6 Model=0x97 Stepping=2
> > > > > > Features=0xbfebfbff
> > > > > > Features2=0x7ffafbff
> > > > > > AMD Features=0x2c100800
> > > > > > AMD Features2=0x121
> > > > > > Structured Extended Features=0x239ca7eb
> > > > > > Structured Extended Features2=0x98c027ac
> > > > > > Structured Extended Features3=0xfc1cc410
> > > > > > XSAVE Features=0xf
> > > > > > IA32_ARCH_CAPS=0xd6b
> > > > > > VT-x: Basic Features=0x3da0500
> > > > > > Pin-Based Controls=0xff
> > > > > > Primary Processor Controls=0xfffbfffe
> > > > > > Secondary Processor Controls=0xf5d7fff
> > > > > > Exit Controls=0x3da0500
> > > > > > Entry Controls=0x3da0500
> > > > > > EPT Features=0x6f34141
> > > > > > VPID Features=0x10f01
> > > > > > TSC: P-state invariant, performance statistics
> > > > > > -64-Byte prefetching
> > > > > > -L2 cache: 1280 kbytes, 8-way associative, 64 bytes/line
> > > > > > +L2 cache: 2048 kbytes, 16-way associative, 64 bytes/line
> > > > > >
> > > > >
> > > > > Show me the full verbose dmesg of the boot then.
> > > > >
> > > > > As another blind guess, try to disable pcid, vm.pmap.pcid_enabled=0.
> > > > >
> > > >
> > > > Hi.
> > > >
> > > > Intel N100 is reported to crash without this tunable on 13.2 at
> > > > freebsd-users-jp ML (as this is a ML in Japanese, reported in
> > > > Japanese). [1]
> > > > Crashes with UFS, but ZFS is claimed to be OK.
> > > >
> > > > N100 is an Alder Lake-N processor WITHOUT P-CORE. [2] [3]
> > > > So check logics on workaround codes (IIRC, all are MFC'ed before 13.2)
> > > > wouldn't be working?
> > >
> > > Show me the output from x86info -r on the machine, I do not care which
> > > specific core it is, they should be all the same. x86info is available
> > > as sysutils/x86info.
> >
> > Requested to original reporter and got the result below.
> > HTH.
> >
> > -----------------------
> > root@eq12:~ # x86info -r
> > x86info v1.31pre
> > /dev/cpuctl0: No such file or directory
> > Found 4 identical CPUs
> > Extended Family: 0 Extended Model: 11 Family: 6 Model: 190 Stepping: 0
> > Type: 0 (Original OEM)
> > CPU Model (x86info's best guess): Unknown model.
> ...
> > eax in: 0x0000001a, eax = 20000001 ebx = 00000000 ecx = 00000000 edx = 00000000
>
> The CPU is reported as small core/atom, so the workaround is turned on.
> I do not think that the issue reported is related to the TLB/PG_G errata.
>
> Why do you think that this is hw issue at all, and not some software bug
> in the build etc ?

Because the issue looks similar: it crashes on UFS but not on ZFS, and,
as far as the original reporter has tested, setting
vm.pmap.pcid_enabled=0 in /boot/loader.conf helped.

Moreover, the N100 is an Alder Lake-N CPU, so it may well share the
same design issue (common circuits, firmware, ...).

So I suspected that the same problem persists even without P-cores, and
advised the original reporter to add the workaround to
/boot/loader.conf. It seems to have helped so far.

-- 
Tomoaki AOKI
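
For readers following the CPUID output quoted above, here is a minimal sketch of how the "eax = 20000001" value from leaf 0x1a can be read as "small core/atom". It assumes the layout documented for that leaf in the Intel SDM (core type in EAX bits 31:24, native model ID in bits 23:0). The function and table names are hypothetical and are not taken from x86info or from the FreeBSD kernel.

```python
# Sketch only: decode EAX of CPUID leaf 0x1A (hybrid information).
# Assumed layout (Intel SDM): bits 31:24 = core type, bits 23:0 = native model ID.
# CORE_TYPES and decode_leaf_1a_eax are illustrative names, not an existing API.
CORE_TYPES = {
    0x20: "Intel Atom (E-core)",
    0x40: "Intel Core (P-core)",
}


def decode_leaf_1a_eax(eax):
    """Return (core type description, native model ID) for a leaf 0x1A EAX value."""
    core_type = (eax >> 24) & 0xFF
    native_model_id = eax & 0xFFFFFF
    description = CORE_TYPES.get(core_type, "unknown (0x%02x)" % core_type)
    return description, native_model_id


if __name__ == "__main__":
    # Value taken from the x86info -r output quoted in this message.
    kind, model_id = decode_leaf_1a_eax(0x20000001)
    print(kind, model_id)  # prints: Intel Atom (E-core) 1
```

A core type of 0x20 is what marks every core of the N100 as an Atom-type core, which matches the remark that the workaround is turned on for this CPU.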