Date: Mon, 11 Sep 2023 08:26:16 +0200 From: Alexander Leidinger <Alexander@Leidinger.net> To: Robert Clausecker <fuz@fuz.su> Cc: Warner Losh <imp@bsdimp.com>, current@freebsd.org Subject: Re: sed in CURRENT fails in textproc/jq Message-ID: <100f27b9bf542630377af4eae5e8c03c@Leidinger.net> In-Reply-To: <ZP30mmdQZulWVUh6@fuz.su> References: <30a59ac0-28fd-f6ed-505c-9ef6d8e84fc3@tsoft.com> <CAALwa8kAaDw80h2cbaeLOmw2G9EpRo6dJFZcFFc99mAU_7xkRA@mail.gmail.com> <86edj7qnia.fsf@ltc.des.no> <ZP1LPZcC99zIS1Jp@fuz.su> <CANCZdfpesLu%2B5YLB4-irR7XSSSM1wXPLH58TbefPzvGo=q1xgg@mail.gmail.com> <ZP3GWo3ZXahC8_2_@fuz.su> <CANCZdfoCBXxNzfTQ03bHwgOkRMoZAAkj_yxZ58i82e0chMWEog@mail.gmail.com> <ZP30mmdQZulWVUh6@fuz.su>
next in thread | previous in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --=_373938dfb81f1ff1d5af74c34b9d4759 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=UTF-8; format=flowed Am 2023-09-10 18:53, schrieb Robert Clausecker: > Hi Warner, > > Thank you for your response. > > Am Sun, Sep 10, 2023 at 09:53:03AM -0600 schrieb Warner Losh: >> On Sun, Sep 10, 2023, 7:36 AM Robert Clausecker <fuz@fuz.su> wrote: >> >> > Hi Warner, >> > >> > I have pushed a fix. It should hopefully address those failing tests. >> > The same issue should also affect memcmp(), but unlike for memchr(), it is >> > illegal to pass a length to memcmp() that extends past the actual end of >> > the buffer as memcmp() is permitted to examine the whole buffer regardless >> > of where the first mismatch is. >> > >> > I am considering a change to improve the behaviour of memcmp() on such >> > errorneous inputs. There are two options: (a) I could change memcmp() the >> > same way I fixed memchr() and have implausible buffer lengths behave as if >> > the buffer goes to the end of the address space or (b) I could change >> > memcmp() to crash loudly if it detects such a case. I could also >> > (c) leave memcmp() as is. Which of these three choices is preferable? >> > >> >> What does the standard say? I'm highly skeptical that these corner >> cases are >> UB behavior. >> >> I'd like actual support for this statement, rather than your >> conjecture >> that it's >> illegal. Even if you can come up with that, preserving the old >> behavior is >> my >> first choice. Especially since many of these functions aren't well >> defined >> by >> a standard, but are extensions. >> >> As for memchr, >> https://pubs.opengroup.org/onlinepubs/009696799/functions/memchr.html >> has no such permission to examine 'the entire buffer at once' nor any >> restirction >> as to the length extending beyond the address space. I'm skeptical of >> your >> reading >> that it allows one to examine all of [b, b + len), so please explain >> where >> the standard >> supports reading past the first occurance. > > memchr() in particular is specified to only examine the input until the > matching character is found (ISO/IEC 9899:2011 § 7.24.5.1): > > *** > The memchr function locates the first occurrence of c (converted to an > unsigned char) in the initial n characters (each interpreted as > unsigned > char) of the object pointed to by s. The implementation shall behave as > if it reads the characters sequentially and stops as soon as a matching > character is found. > *** > > Therefore, it appears reasonable that calls with fake buffer lengths > (e.g. SIZE_MAX, to read until a mismatch occurs) must be supported. > However, memcmp() has no such language and the text explicitly states > that the whole buffer is compared (ISO/IEC 9899:2011 § 7.24.4.1): > > *** > The memcmp function compares the first n characters of the object > pointed to by s1 to the first n characters of the object pointed to by > s2. > *** > > By omission, this seems to give license to e.g. implement memcmp() like > timingsafe_memcmp() where it inspects all n characters of both buffers > and only then gives a result. So if n is longer than the actual buffer > (e.g. n == SIZE_MAX), behaviour may not be defined (e.g. there could be > a crash due to crossing into an unmapped page). > > Thus I have patched memchr() to behave correctly when length SIZE_MAX > is > given (commit b2618b65). My memcmp() suffers from similarly flawed > logic and may need to be patched. However, as the language I cited > above > does not indicate that such usage needs to be supported for memcmp() > (whereas it must be for memchr(), contrary to my assumptions), I was > asking you for how to proceed with memcmp (hence choices (a)--(c)). My 2ct: What did the previous implementation of memcmp() do in this case? - If it was generous and behaved similar to the requirements of memchr(), POLA requires to have the same now too. - If it was crashing or silently going on (= lurking bugs in 3rd party code), we may have the possibility to do a coredump in case of running past the end of the buffer to prevent malicous use. - In general I go with the robustness principle, "be liberal what you accept, but strict in what you provide" = memcmp() should behave as if it is supported. Bye, Alexander. -- http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF http://www.FreeBSD.org netchild@FreeBSD.org : PGP 0x8F31830F9F2772BF --=_373938dfb81f1ff1d5af74c34b9d4759 Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc; size=833 Content-Description: OpenPGP digital signature -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEER9UlYXp1PSd08nWXEg2wmwP42IYFAmT+sxcACgkQEg2wmwP4 2Ib0rRAAgxhAKIefJK9TIYgnC0sIoCH9WJePhu5fV7NIP/dBPrqSrZGZcNpznDQN YSgF6iQ1JNhaKRlKUE5BKdigR2J4ZB/jNbMabloaa2EoMzi6+z6JuQNe7Q2Zxikj 2JWMVSktzWYIsQoLcr0TWIvXTkd0oJ9i28xDyrAFSgPXBwqMNl4miFq5/m5ZfF8q Co56E5y+xL66wXv2Htnltc22Gv959ajD+bVyTOgTMGbLIq8nC//LTyMQDV1LbjGR gMcRYeqMqcHUV+CB1FQBfGS/oOopxhJ9EbDdJvfnS7SSbFtkXxRntKuLFS+KOqi/ FRcmMFvI7FxSNebTa8l2jBADTpbV5q3B8CezqYggLmmCa8M9wo9iZJgeFSLsQf+F QrppvHLASMV55KQUVPuQJt02fqM1XKl8bHKcAznnxGo46tVSiYomHAzHAfDqKGaF FoQI1ok0FH8o7TOG5dcJW73CXL8mBrHoaBT/vjC+cCYslOcE4zD2ao9fpzYcfoL4 KEyORt9XJoRAA7OBtvUILHRGN0CgRrG9wDDb22i/Romvhg2mFK1iIT9kA1LyGeZR 3sfbqTVNdXqmeGf1uU6/YMUTefiPYmV7Uj1FFmOLwsI4LM/njNitslZB+ddkFPHZ zUdGl1sicdD+t6Cj7+Jr5/PE43coFpB3ca1kdnZVlayOtc4u+9A= =S4xN -----END PGP SIGNATURE----- --=_373938dfb81f1ff1d5af74c34b9d4759--
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?100f27b9bf542630377af4eae5e8c03c>