Date: Tue, 13 Feb 2018 07:03:14 -0500
From: Mike Tancsa <mike@sentex.net>
To: Konstantin Belousov <kib@freebsd.org>, Elliott.Rabe@dell.com
Cc: alc@freebsd.org, freebsd-hackers@freebsd.org, markj@freebsd.org, Eric.Van.Gyzen@dell.com
Subject: Re: Stale memory during post fork cow pmap update
Message-ID: <51a330e1-10fa-e5cb-e8a9-c519680fdbcd@sentex.net>
In-Reply-To: <20180210225608.GM33564@kib.kiev.ua>
References: <5A7E7F2B.80900@dell.com> <20180210111848.GL33564@kib.kiev.ua> <5A7F6A7C.80607@dell.com> <20180210225608.GM33564@kib.kiev.ua>

On 2/10/2018 5:56 PM, Konstantin Belousov wrote:
> On Sat, Feb 10, 2018 at 09:56:20PM +0000, Elliott.Rabe@dell.com wrote:
>> On 02/10/2018 05:18 AM, Konstantin Belousov wrote:
>>> On Sat, Feb 10, 2018 at 05:12:11AM +0000, Elliott.Rabe@dell.com wrote:
>>>> Greetings-
>>>>
>>>> I've been hunting for the root cause of elusive, slight memory
>>>> corruptions in a large, complex process that manages many threads. All
>>>> failures and experimentation thus far has been on x86_64 architecture
>>>> machines, and pmap_pcid is not in use.
>>>>

The patch below seems to fix the issues I was seeing in
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225584
At least, I have not been able to reproduce it. It would normally take
2-3 builds of net/samba47 to manifest, but I was able to do 70 overnight
without a failure. For some reason, this issue was far more acute on AMD
Ryzen CPUs than on any of the Intel CPUs I had been testing on.

> So I agree that doing two-stage COW, with the first stage copying page
> but keeping it read-only, is the right solution. Below is my take.
> During the smoke boot, I noted that there is somewhat related issue in
> reevaluation of the map entry permissions.
>
> diff --git a/sys/vm/vm_fault.c b/sys/vm/vm_fault.c
> index 83e12a588ee..149a15f1d9d 100644
> --- a/sys/vm/vm_fault.c
> +++ b/sys/vm/vm_fault.c
> @@ -1135,6 +1157,10 @@ RetryFault:;
>  				 */
>  				pmap_copy_page(fs.m, fs.first_m);
>  				fs.first_m->valid = VM_PAGE_BITS_ALL;
> +				if ((fault_flags & VM_FAULT_WIRE) == 0) {
> +					prot &= ~VM_PROT_WRITE;
> +					fault_type &= ~VM_PROT_WRITE;
> +				}
>  				if (wired && (fault_flags &
>  				    VM_FAULT_WIRE) == 0) {
>  					vm_page_lock(fs.first_m);
> @@ -1219,6 +1245,12 @@ RetryFault:;
>  			 * write-enabled after all.
>  			 */
>  			prot &= retry_prot;
> +			fault_type &= retry_prot;
> +			if (prot == 0) {
> +				release_page(&fs);
> +				unlock_and_deallocate(&fs);
> +				goto RetryFault;
> +			}
>  		}
>  	}
>
> _______________________________________________
> freebsd-hackers@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
>
>

-- 
-------------------
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, mike@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada
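
For readers following the thread, the "two-stage COW" idea quoted above
(copy the page first but keep it read-only, and grant write access only
on a later fault) can be illustrated with a small userland toy model.
This is only a sketch under stated assumptions: every name in it
(toy_page, toy_mapping, toy_write_fault, toy_store, the TOY_* constants)
is hypothetical and invented for illustration, it is not the vm_fault()
code, and it does not model TLBs, locking, or multiple CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 16   /* toy value, not a real page size */
#define TOY_PROT_R    0x1
#define TOY_PROT_W    0x2

struct toy_page {
	unsigned char data[TOY_PAGE_SIZE];
};

/* Hypothetical stand-in for one mapping installed in a pmap. */
struct toy_mapping {
	struct toy_page *page;	/* page currently mapped */
	int prot;		/* permissions currently installed */
	bool private_copy;	/* true once the COW copy exists */
};

/*
 * Handle one write fault and return the stage that ran.
 * Stage 1 copies the shared page and maps the copy read-only.
 * Stage 2 only upgrades the already-copied page to writable.
 */
int
toy_write_fault(struct toy_mapping *m, struct toy_page *copy)
{
	if (!m->private_copy) {
		memcpy(copy->data, m->page->data, TOY_PAGE_SIZE);
		m->page = copy;
		m->prot = TOY_PROT_R;	/* deliberately not writable yet */
		m->private_copy = true;
		return (1);
	}
	m->prot = TOY_PROT_R | TOY_PROT_W;
	return (2);
}

/*
 * Attempt a store, re-faulting until the mapping is writable, the way
 * a retried instruction would. Returns the number of faults taken.
 */
int
toy_store(struct toy_mapping *m, struct toy_page *copy, int off,
    unsigned char v)
{
	int faults = 0;

	assert(off >= 0 && off < TOY_PAGE_SIZE);
	while ((m->prot & TOY_PROT_W) == 0) {
		toy_write_fault(m, copy);
		faults++;
	}
	m->page->data[off] = v;
	return (faults);
}
```

In this model a store to a shared read-only page takes two faults
instead of one: the first replaces the mapping with a read-only private
copy, and the second makes that copy writable. The original shared page
is never modified.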