Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 11 Sep 2023 08:26:16 +0200
From:      Alexander Leidinger <Alexander@Leidinger.net>
To:        Robert Clausecker <fuz@fuz.su>
Cc:        Warner Losh <imp@bsdimp.com>, current@freebsd.org
Subject:   Re: sed in CURRENT fails in textproc/jq
Message-ID:  <100f27b9bf542630377af4eae5e8c03c@Leidinger.net>
In-Reply-To: <ZP30mmdQZulWVUh6@fuz.su>
References:  <30a59ac0-28fd-f6ed-505c-9ef6d8e84fc3@tsoft.com> <CAALwa8kAaDw80h2cbaeLOmw2G9EpRo6dJFZcFFc99mAU_7xkRA@mail.gmail.com> <86edj7qnia.fsf@ltc.des.no> <ZP1LPZcC99zIS1Jp@fuz.su> <CANCZdfpesLu%2B5YLB4-irR7XSSSM1wXPLH58TbefPzvGo=q1xgg@mail.gmail.com> <ZP3GWo3ZXahC8_2_@fuz.su> <CANCZdfoCBXxNzfTQ03bHwgOkRMoZAAkj_yxZ58i82e0chMWEog@mail.gmail.com> <ZP30mmdQZulWVUh6@fuz.su>

next in thread | previous in thread | raw e-mail | index | archive | help
This is an OpenPGP/MIME signed message (RFC 4880 and 3156)

--=_373938dfb81f1ff1d5af74c34b9d4759
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8;
 format=flowed

Am 2023-09-10 18:53, schrieb Robert Clausecker:
> Hi Warner,
> 
> Thank you for your response.
> 
> Am Sun, Sep 10, 2023 at 09:53:03AM -0600 schrieb Warner Losh:
>> On Sun, Sep 10, 2023, 7:36 AM Robert Clausecker <fuz@fuz.su> wrote:
>> 
>> > Hi Warner,
>> >
>> > I have pushed a fix.  It should hopefully address those failing tests.
>> > The same issue should also affect memcmp(), but unlike for memchr(), it is
>> > illegal to pass a length to memcmp() that extends past the actual end of
>> > the buffer as memcmp() is permitted to examine the whole buffer regardless
>> > of where the first mismatch is.
>> >
>> > I am considering a change to improve the behaviour of memcmp() on such
>> > errorneous inputs.  There are two options: (a) I could change memcmp() the
>> > same way I fixed memchr() and have implausible buffer lengths behave as if
>> > the buffer goes to the end of the address space or (b) I could change
>> > memcmp() to crash loudly if it detects such a case.  I could also
>> > (c) leave memcmp() as is.  Which of these three choices is preferable?
>> >
>> 
>> What does the standard say? I'm highly skeptical that these corner 
>> cases are
>> UB behavior.
>> 
>> I'd like actual support for this statement, rather than your 
>> conjecture
>> that it's
>> illegal. Even if you can come up with that, preserving the old 
>> behavior is
>> my
>> first choice. Especially since many of these functions aren't well 
>> defined
>> by
>> a standard, but are extensions.
>> 
>> As for memchr,
>> https://pubs.opengroup.org/onlinepubs/009696799/functions/memchr.html
>> has no such permission to examine 'the entire buffer at once' nor any
>> restirction
>> as to the length extending beyond the address space. I'm skeptical of 
>> your
>> reading
>> that it allows one to examine all of [b, b + len), so please explain 
>> where
>> the standard
>> supports reading past the first occurance.
> 
> memchr() in particular is specified to only examine the input until the
> matching character is found (ISO/IEC 9899:2011 § 7.24.5.1):
> 
> ***
> The memchr function locates the first occurrence of c (converted to an
> unsigned char) in the initial n characters (each interpreted as 
> unsigned
> char) of the object pointed to by s. The implementation shall behave as
> if it reads the characters sequentially and stops as soon as a matching
> character is found.
> ***
> 
> Therefore, it appears reasonable that calls with fake buffer lengths
> (e.g. SIZE_MAX, to read until a mismatch occurs) must be supported.
> However, memcmp() has no such language and the text explicitly states
> that the whole buffer is compared (ISO/IEC 9899:2011 § 7.24.4.1):
> 
> ***
> The memcmp function compares the first n characters of the object
> pointed to by s1 to the first n characters of the object pointed to by 
> s2.
> ***
> 
> By omission, this seems to give license to e.g. implement memcmp() like
> timingsafe_memcmp() where it inspects all n characters of both buffers
> and only then gives a result.  So if n is longer than the actual buffer
> (e.g. n == SIZE_MAX), behaviour may not be defined (e.g. there could be
> a crash due to crossing into an unmapped page).
> 
> Thus I have patched memchr() to behave correctly when length SIZE_MAX 
> is
> given (commit b2618b65).  My memcmp() suffers from similarly flawed
> logic and may need to be patched.  However, as the language I cited 
> above
> does not indicate that such usage needs to be supported for memcmp()
> (whereas it must be for memchr(), contrary to my assumptions), I was
> asking you for how to proceed with memcmp (hence choices (a)--(c)).

My 2ct:
What did the previous implementation of memcmp() do in this case?
  - If it was generous and behaved similar to the requirements of
    memchr(), POLA requires to have the same now too.
  - If it was crashing or silently going on (= lurking bugs in 3rd
    party code), we may have the possibility to do a coredump in case
    of running past the end of the buffer to prevent malicous use.
  - In general I go with the robustness principle, "be liberal what you
    accept, but strict in what you provide" = memcmp() should behave
    as if it is supported.

Bye,
Alexander.

-- 
http://www.Leidinger.net Alexander@Leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netchild@FreeBSD.org  : PGP 0x8F31830F9F2772BF

--=_373938dfb81f1ff1d5af74c34b9d4759
Content-Type: application/pgp-signature;
 name=signature.asc
Content-Disposition: attachment;
 filename=signature.asc;
 size=833
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEER9UlYXp1PSd08nWXEg2wmwP42IYFAmT+sxcACgkQEg2wmwP4
2Ib0rRAAgxhAKIefJK9TIYgnC0sIoCH9WJePhu5fV7NIP/dBPrqSrZGZcNpznDQN
YSgF6iQ1JNhaKRlKUE5BKdigR2J4ZB/jNbMabloaa2EoMzi6+z6JuQNe7Q2Zxikj
2JWMVSktzWYIsQoLcr0TWIvXTkd0oJ9i28xDyrAFSgPXBwqMNl4miFq5/m5ZfF8q
Co56E5y+xL66wXv2Htnltc22Gv959ajD+bVyTOgTMGbLIq8nC//LTyMQDV1LbjGR
gMcRYeqMqcHUV+CB1FQBfGS/oOopxhJ9EbDdJvfnS7SSbFtkXxRntKuLFS+KOqi/
FRcmMFvI7FxSNebTa8l2jBADTpbV5q3B8CezqYggLmmCa8M9wo9iZJgeFSLsQf+F
QrppvHLASMV55KQUVPuQJt02fqM1XKl8bHKcAznnxGo46tVSiYomHAzHAfDqKGaF
FoQI1ok0FH8o7TOG5dcJW73CXL8mBrHoaBT/vjC+cCYslOcE4zD2ao9fpzYcfoL4
KEyORt9XJoRAA7OBtvUILHRGN0CgRrG9wDDb22i/Romvhg2mFK1iIT9kA1LyGeZR
3sfbqTVNdXqmeGf1uU6/YMUTefiPYmV7Uj1FFmOLwsI4LM/njNitslZB+ddkFPHZ
zUdGl1sicdD+t6Cj7+Jr5/PE43coFpB3ca1kdnZVlayOtc4u+9A=
=S4xN
-----END PGP SIGNATURE-----

--=_373938dfb81f1ff1d5af74c34b9d4759--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?100f27b9bf542630377af4eae5e8c03c>