Date: Sun, 10 Sep 2023 18:53:46 +0200 From: Robert Clausecker <fuz@fuz.su> To: Warner Losh <imp@bsdimp.com> Cc: current@freebsd.org Subject: Re: sed in CURRENT fails in textproc/jq Message-ID: <ZP30mmdQZulWVUh6@fuz.su> In-Reply-To: <CANCZdfoCBXxNzfTQ03bHwgOkRMoZAAkj_yxZ58i82e0chMWEog@mail.gmail.com> References: <30a59ac0-28fd-f6ed-505c-9ef6d8e84fc3@tsoft.com> <CAALwa8kAaDw80h2cbaeLOmw2G9EpRo6dJFZcFFc99mAU_7xkRA@mail.gmail.com> <86edj7qnia.fsf@ltc.des.no> <ZP1LPZcC99zIS1Jp@fuz.su> <CANCZdfpesLu%2B5YLB4-irR7XSSSM1wXPLH58TbefPzvGo=q1xgg@mail.gmail.com> <ZP3GWo3ZXahC8_2_@fuz.su> <CANCZdfoCBXxNzfTQ03bHwgOkRMoZAAkj_yxZ58i82e0chMWEog@mail.gmail.com>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi Warner, Thank you for your response. Am Sun, Sep 10, 2023 at 09:53:03AM -0600 schrieb Warner Losh: > On Sun, Sep 10, 2023, 7:36 AM Robert Clausecker <fuz@fuz.su> wrote: > > > Hi Warner, > > > > I have pushed a fix. It should hopefully address those failing tests. > > The same issue should also affect memcmp(), but unlike for memchr(), it is > > illegal to pass a length to memcmp() that extends past the actual end of > > the buffer as memcmp() is permitted to examine the whole buffer regardless > > of where the first mismatch is. > > > > I am considering a change to improve the behaviour of memcmp() on such > > errorneous inputs. There are two options: (a) I could change memcmp() the > > same way I fixed memchr() and have implausible buffer lengths behave as if > > the buffer goes to the end of the address space or (b) I could change > > memcmp() to crash loudly if it detects such a case. I could also > > (c) leave memcmp() as is. Which of these three choices is preferable? > > > > What does the standard say? I'm highly skeptical that these corner cases are > UB behavior. > > I'd like actual support for this statement, rather than your conjecture > that it's > illegal. Even if you can come up with that, preserving the old behavior is > my > first choice. Especially since many of these functions aren't well defined > by > a standard, but are extensions. > > As for memchr, > https://pubs.opengroup.org/onlinepubs/009696799/functions/memchr.html > has no such permission to examine 'the entire buffer at once' nor any > restirction > as to the length extending beyond the address space. I'm skeptical of your > reading > that it allows one to examine all of [b, b + len), so please explain where > the standard > supports reading past the first occurance. memchr() in particular is specified to only examine the input until the matching character is found (ISO/IEC 9899:2011 § 7.24.5.1): *** The memchr function locates the first occurrence of c (converted to an unsigned char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s. The implementation shall behave as if it reads the characters sequentially and stops as soon as a matching character is found. *** Therefore, it appears reasonable that calls with fake buffer lengths (e.g. SIZE_MAX, to read until a mismatch occurs) must be supported. However, memcmp() has no such language and the text explicitly states that the whole buffer is compared (ISO/IEC 9899:2011 § 7.24.4.1): *** The memcmp function compares the first n characters of the object pointed to by s1 to the first n characters of the object pointed to by s2. *** By omission, this seems to give license to e.g. implement memcmp() like timingsafe_memcmp() where it inspects all n characters of both buffers and only then gives a result. So if n is longer than the actual buffer (e.g. n == SIZE_MAX), behaviour may not be defined (e.g. there could be a crash due to crossing into an unmapped page). Thus I have patched memchr() to behave correctly when length SIZE_MAX is given (commit b2618b65). My memcmp() suffers from similarly flawed logic and may need to be patched. However, as the language I cited above does not indicate that such usage needs to be supported for memcmp() (whereas it must be for memchr(), contrary to my assumptions), I was asking you for how to proceed with memcmp (hence choices (a)--(c)). > Also, all the tests that started failing with your commit, available here, > > https://ci.freebsd.org/job/FreeBSD-main-amd64-test/24136/ > > should be fixed at a bare minimum. Yes, there's a lot of other tests > that are failing: those should be fixed or disabled also. Have these > tests been fixed? They are showing up as failed still in the latest run > > https://ci.freebsd.org/job/FreeBSD-main-amd64-test/24141/ > > shows them still failing by my reading. This run crashed kyua due to an issue that seems to be an instance of bug #273481 and no final report was generated. However, it seems like the unit tests in question do not fail anymore. Let's wait for build 24142. > > Also, we're several weeks in, and we're still fixing basic, fundamental > functions that show breakage. This indicates to me that proper care > hasn't been taken to replace the critical functions in the tree. It also > indicates to me that a stronger level of rigor is needed given that the > problems are trickling in every few days to weeks. This suggests that > the unusual step of backing all this work out of stable/14 and releng/14.0 > would be indicated until we can go through this process to the point > that the new bugs have stopped (which given the decay rate seen so > far suggests we won't be done before 14.0 is released). i know this is > an enormous pain, but I think the track record so far supports backing > these out of the release we're about to do. They just arrived too late > for the normal 'cooking' process of FreeBSD to sufficiently cook them > in time for 14.0. I understand your concerns and will prepare commits to back out the SIMD work from releng/14.0 and stable/14. I apologise for my insufficient testing. I hope I can improve my testing for upcoming changes. > Warner > > Yours, > > Robert Clausecker > > > > Am Sun, Sep 10, 2023 at 05:51:43AM -0600 schrieb Warner Losh: > > > On Sat, Sep 9, 2023, 10:51 PM Robert Clausecker <fuz@fuz.su> wrote: > > > > > > > Greetings, > > > > > > > > I apologise for the inconvenience. The issue seems to boil down to > > > > various places calling > > > > > > > > memchr(buf, c, SIZE_MAX); > > > > > > > > which causes an overflow when my newly written memchr() computes buf + > > > > len to find the end of the buffer. A patch to alleviate this issue can > > > > be found here: > > > > > > > > > > > > > > http://fuz.su/~fuz/freebsd/0001-lib-libc-amd64-string-memchr.S-fix-behaviour-with-ov.patch > > > > > > > > Please check if it does the trick for you. If yes, I'll go ahead and > > > > push it tomorrow-ish. > > > > > > > > > > > > > There are half a dozen or do kyua tests that are likely failing because > > of > > > this or other reasons related to strings. When you push this fix you'll > > > get the list. They are hidden among about 80 or so networking tests that > > > fail. I plan on disabling those tests soon If no one fixes them. > > > > > > Warner > > > > > > > > > > > Yours, > > > > Robert Clausecker > > > > > > > > Am Sat, Sep 09, 2023 at 07:12:29PM +0200 schrieb Dag-Erling Smørgrav: > > > > > Antoine Brodin <antoine@freebsd.org> writes: > > > > > > Yuri <yuri@freebsd.org> writes: > > > > > > > Either something has changed in sed(1) in CURRENT, or sed just > > fails > > > > > > > during the configure stage of textproc/jq: > > > > > > > > > > > > > > sed: No error: 0 > > > > > > > checking for sys/cygwin.h... eval: ${+...}: Bad substitution > > > > > > This seems to be a recent issue (less than 5 days). > > > > > > Hundreds of configure scripts now fail to run on 15-current due to > > > > > > this sed failure: [...] > > > > > > > > > > Try adding ARCHLEVEL=scalar to CONFIGURE_ENV on one of these. If > > that > > > > > helps, yell at fuz@ :) > > > > > > > > > > DES > > > > > -- > > > > > Dag-Erling Smørgrav - des@FreeBSD.org > > > > > > > > > > > > > -- > > > > () ascii ribbon campaign - for an 8-bit clean world > > > > /\ - against html email - against proprietary attachments > > > > > > > > > > > > -- > > () ascii ribbon campaign - for an 8-bit clean world > > /\ - against html email - against proprietary attachments > > -- () ascii ribbon campaign - for an 8-bit clean world /\ - against html email - against proprietary attachments
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?ZP30mmdQZulWVUh6>