Date: Tue, 25 Jul 2017 07:05:58 +0000
From: bugzilla-noreply@freebsd.org
To: freebsd-bugs@FreeBSD.org
Subject: [Bug 219399] System panics after several hours of 14-threads-compilation orgies using poudriere on AMD Ryzen...
Message-ID: <bug-219399-8-RmIDm26swj@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-219399-8@https.bugs.freebsd.org/bugzilla/>
References: <bug-219399-8@https.bugs.freebsd.org/bugzilla/>
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

--- Comment #119 from Don Lewis <truckman@FreeBSD.org> ---
(In reply to Nils Beyer from comment #115)

I have some suspicions about what might be going wrong with ryzen_segv_test, but I really don't understand the memory fence / serialization stuff well enough to be sure.

An experiment that I performed to try to get a better idea of where things might be going off the rails was to add this code:

        if (func_set->func[func_set->offset] != 0x8b) {
                fprintf(stderr, "First opcode should be 0x8b, but found 0x%x\n",
                    func_set->func[func_set->offset]);
        }

to thread1() in between the

        pf = (func_t)(&func_set->func[ func_set->offset ]);

and

        ret2 = pf(func_set);

to verify that the expected opcode was actually where we plan to jump to. What was interesting is that the error never triggered, *but* the frequency of segfaults went way down.

That led me to look at what the mfence instruction actually does:

    Acts as a barrier to force strong memory ordering (serialization)
    between load and store instructions preceding the MFENCE, and load
    and store instructions that follow the MFENCE. A weakly-ordered
    memory system allows the hardware to reorder reads and writes
    between the processor and memory. The MFENCE instruction guarantees
    that the system completes all previous memory accesses before
    executing subsequent accesses.

    The MFENCE instruction is weakly-ordered with respect to data and
    instruction prefetches.

Note the last sentence!

The mfence() at the end of the threadx() loop should flush all the pending writes out to cache associated with core A before the lock is unlocked and thread1() is permitted to do its work.

It looks to me like the mfence() at the top of the thread1() loop would not do anything since there aren't any interesting loads or stores that might be outstanding for core B at that point.
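
For readers without the ryzen_segv_test source at hand, the opcode check and the two barrier helpers discussed above might look roughly like the sketch below. This is not the test's actual code. The struct layout, the 64-byte buffer size, the helper name first_opcode_ok(), and the inline-asm bodies of mfence() and serialize() are all assumptions made for illustration. Only the names func_set, func, offset, mfence() and serialize(), and the 0x8b check, come from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define EXPECTED_OPCODE 0x8b /* first byte the copied function is expected to start with */

/* Hypothetical stand-in for the test's func_set; only the field names come from the post. */
struct func_set {
    size_t offset;    /* where the copied function starts inside func[] */
    uint8_t func[64]; /* buffer holding the copied machine code (size is an assumption) */
};

#if defined(__x86_64__) || defined(__i386__)
/* Assumed implementation: order earlier loads/stores before later ones. */
static inline void mfence(void)
{
    __asm__ volatile("mfence" ::: "memory");
}

/* Assumed implementation: CPUID is a serializing instruction on x86. */
static inline void serialize(void)
{
    unsigned int eax = 0, ebx, ecx = 0, edx;
    __asm__ volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                     :
                     : "memory");
    (void)ebx;
    (void)edx;
}
#else
/* Non-x86 fallback so the sketch still builds: a full memory barrier only,
 * which does not serialize the instruction stream. */
static inline void mfence(void) { __sync_synchronize(); }
static inline void serialize(void) { __sync_synchronize(); }
#endif

/* Returns 1 if the byte at the planned jump target is the expected opcode, else 0. */
static int first_opcode_ok(const struct func_set *fs)
{
    assert(fs->offset < sizeof fs->func);
    if (fs->func[fs->offset] != EXPECTED_OPCODE) {
        fprintf(stderr, "First opcode should be 0x8b, but found 0x%x\n",
                fs->func[fs->offset]);
        return 0;
    }
    return 1;
}
```

Note that the check itself performs a data load from the same bytes that are about to be executed, so adding it changes the timing and memory traffic just before the indirect call.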
One would think that serialize() aka the cpuid instruction would prevent instruction prefetching of old stale data before that point, but maybe not. I just don't understand this well enough ...

Next I tried moving

        mfence();
        serialize();

from just after lock_enter*() to just before the call into the newly moved data array:

        ret2 = pf(func_set);

This gives mfence() some memory loads to wait for, which allows the data to be migrated from the core A cache. With this change, I no longer get any segfaults.

Ryzen bug? Just more aggressive prefetching? I don't know ...

-- 
You are receiving this mail because:
You are the assignee for the bug.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?bug-219399-8-RmIDm26swj>