Date: Fri, 19 Apr 2013 15:49:42 +0300
From: Konstantin Belousov <kostikbel@gmail.com>
To: Carl Shapiro <carl.shapiro@gmail.com>
Cc: FreeBSD Hackers <freebsd-hackers@freebsd.org>
Subject: Re: MADV_FREE and wait4 EFAULT
Message-ID: <20130419124942.GA67273@kib.kiev.ua>
In-Reply-To: <CANVK_QgRBO5ZU=NHCr1XTvtxYpWk6LjWEv8Q-70mY6CzqHO2TA@mail.gmail.com>
References: <CANVK_QgKRkpzWjA=H2u2HTp_vpxFhNLBGTVuFZmMEpBLTbzeaA@mail.gmail.com> <20130417082143.GW2930@kib.kiev.ua> <CANVK_QgRBO5ZU=NHCr1XTvtxYpWk6LjWEv8Q-70mY6CzqHO2TA@mail.gmail.com>

On Thu, Apr 18, 2013 at 02:51:43PM -0700, Carl Shapiro wrote:
> On Wed, Apr 17, 2013 at 1:21 AM, Konstantin Belousov <kostikbel@gmail.com> wrote:
>
> > Did you ensure with e.g. ktrace and procstat -v that your assumptions
> > hold, i.e. the addresses supplied as wait4(2) arguments are valid?
> > Please provide the minimal test case demonstrating the behaviour.
>
> Yes. I instrumented my code to check for a wait4 failure, print the
> addresses of the status and rusage arguments, and dump the contents of
> /proc/curproc/map. The addresses of the status and rusage arguments are
> always in the range of a mapping and marked as read write.

It would be of some interest to see the evidence.
Is your code multithreaded?

> I have yet to distill the failure to a minimal test case. The test case I
> do have is the test harness for the Go language. After running for about
> 45 minutes I can observe a failure. I have been working to produce
> something smaller and faster.

The test case is required to decide whether the bug is in the application
or in the OS.

> > MADV_FREE should only result in the possible loss of the previous
> > content of the page, not in the faulting of the page access. From the
> > inspection of the code, I do not see how MADV_FREE could result in
> > the memory address becoming invalid.
>
> I see. What has led us to believe this might be an issue with page faults
> is that writing zeroes to the page with memset before passing it to wait4
> makes the error go away.

There is no difference between the access performed by copyout and the
access caused by a usermode write.

> Do you have any advice about how one might go about instrumenting wait4 to
> generate more information about a failed copyout? Are tools such as dtrace
> useful in these situations or might it be too invasive? Because of the
> protracted test cycle and my lack of knowledge in this area, conducting
> experiments is quite painful at the moment.

No, I cannot give advice; I think we should first decide which code
to blame. BTW, you could try enabling sysctl machdep.uprintf_signal.

Oh, you did not specify the architecture and version of the system.
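
The instrumentation described in the quoted text (check for a wait4 failure and print the addresses of the status and rusage arguments) might look roughly like the C sketch below. The helper names wait4_checked and spawn_and_reap are hypothetical and do not come from the thread. The sketch does not dump /proc/curproc/map, and it says nothing about what actually causes the EFAULT.

```c
/* glibc only declares wait4() with this defined in strict modes;
 * assumed to be harmless on other systems. */
#define _DEFAULT_SOURCE

#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): call wait4(2) and, on
 * failure, report errno together with the two user addresses that
 * were handed to the kernel. Retries when interrupted by a signal.
 */
static pid_t
wait4_checked(pid_t pid, int *status, struct rusage *ru)
{
	pid_t r;
	int saved;

	assert(status != NULL && ru != NULL);
	do {
		r = wait4(pid, status, 0, ru);
	} while (r == -1 && errno == EINTR);
	if (r == -1) {
		saved = errno;
		fprintf(stderr,
		    "wait4(%ld): %s (errno %d) status=%p rusage=%p\n",
		    (long)pid, strerror(saved), saved,
		    (void *)status, (void *)ru);
		errno = saved;	/* fprintf may have clobbered errno */
	}
	return (r);
}

/*
 * Hypothetical driver: fork a child that exits with `code`, reap it
 * through wait4_checked(), and return the child's exit code, or -1
 * if fork or wait4 failed or the child did not exit normally.
 */
static int
spawn_and_reap(int code)
{
	struct rusage ru;
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return (-1);
	if (pid == 0)
		_exit(code);
	if (wait4_checked(pid, &status, &ru) != pid)
		return (-1);
	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

When a failure is reported, the printed addresses can be compared against the process's mappings, for example in procstat -v output, to check whether they fall inside a writable mapping.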