Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 3 Jul 2023 00:11:03 -0700
From:      Mark Millard <marklmi@yahoo.com>
To:        khng300@gmail.com, dev-commits-src-main@freebsd.org
Subject:   Re: git: 005aa1743b42 - main - modules: bzero the modspecific_t
Message-ID:  <4729575A-9D5A-462B-9A35-E8B274601215@yahoo.com>
In-Reply-To: <B5F5B65E-5795-4552-A674-244E1E7E5240@yahoo.com>
References:  <B5F5B65E-5795-4552-A674-244E1E7E5240@yahoo.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Jul 2, 2023, at 09:34, Mark Millard <marklmi@yahoo.com> wrote:

> Ka Ho Ng <khng300_at_gmail.com> wrote on
> Date: Sun, 02 Jul 2023 10:18:35 UTC :
>=20
> On Sun, Jul 2, 2023 at 6:03=E2=80=AFAM Ka Ho Ng <khng300@gmail.com> =
wrote:
>>=20
>>> On Sat, Jul 1, 2023 at 7:13=E2=80=AFPM Konstantin Belousov =
<kostikbel@gmail.com>
>>> wrote:
>>>=20
>>>> On Sat, Jul 01, 2023 at 10:59:22PM +0000, Ka Ho Ng wrote:
>>>>> The branch main has been updated by khng:
>>>>>=20
>>>>> URL:
>>>> =
https://cgit.FreeBSD.org/src/commit/?id=3D005aa1743b42b52fbd49b9d5ec448169=
02b6ee9f
>>>>>=20
>>>>> commit 005aa1743b42b52fbd49b9d5ec44816902b6ee9f
>>>>> Author: Ka Ho Ng <khng@FreeBSD.org>
>>>>> AuthorDate: 2023-07-01 19:41:53 +0000
>>>>> Commit: Ka Ho Ng <khng@FreeBSD.org>
>>>>> CommitDate: 2023-07-01 22:58:46 +0000
>>>>>=20
>>>>> modules: bzero the modspecific_t
>>>>>=20
>>>>> Per https://reviews.llvm.org/D68115, only the first field is
>>>>> zero-initialized, meanwhile other fields are undef.
>>>> This is not true.
>>>>=20
>>>>>=20
>>>>> The pattern can be observed on clang as well, that when
>>>>> -ftrivial-auto-var-init=3Dpattern is specified 0xaa is filled for
>>>>> non-active fields, otherwise they are zero-initialized.
>>>> What are non-active fields?
>>>> All fields with implicit initializers
>>>> "shall be initialized implicitly the same as
>>>> objects that have static storage duration."
>>>>=20
>>>> I do not think this is required for padding.
>>>>=20
>>> In that case, modspecific_t ms did become 0xaaaaaaaa00000000 on =
amd64 if
>>> -ftrivial-auto-var-init=3Dpattern was specified.
>>>=20
>>>=20
>>>>=20
>>>>> Technically both are acceptable when using clang. However it
>>>>> would be good to simply bzero the modspecific_t in such case to
>>>>> be strict to the standard.
>>>>>=20
>>>>> MFC with: 2cab2d43b83b
>>>>> MFC after: 1 day
>>>> Min MFC period is 3 days.
>>>>=20
>>>>> Sponsored by: Juniper Networks, Inc.
>>>>> Reviewed by: delphij
>>>>> Differential Revision: https://reviews.freebsd.org/D40830
>>>>> ---
>>>>> sys/kern/kern_syscalls.c | 3 ++-
>>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>=20
>>>>> diff --git a/sys/kern/kern_syscalls.c b/sys/kern/kern_syscalls.c
>>>>> index 234e51cfd280..0b51edf5e985 100644
>>>>> --- a/sys/kern/kern_syscalls.c
>>>>> +++ b/sys/kern/kern_syscalls.c
>>>>> @@ -173,9 +173,10 @@ kern_syscall_module_handler(struct sysent
>>>> *sysents, struct module *mod,
>>>>> int what, void *arg)
>>>>> {
>>>>> struct syscall_module_data *data =3D arg;
>>>>> - modspecific_t ms =3D { 0 };
>>>>> + modspecific_t ms;
>>>>> int error;
>>>>>=20
>>>>> + bzero(&ms, sizeof(ms));
>>>>> switch (what) {
>>>>> case MOD_LOAD:
>>>>> error =3D kern_syscall_register(sysents, data->offset,
>>>=20
>>>=20
>>> Ka Ho
>>>=20
>> Since I missed the reply-all button just now, I resend this email.
>>=20
>> Reading on N2716 one of the footnote stated:
>> "Note that aggregate type does not include union type because an =
object
>> with union type can only contain one member at a time"
>>=20
>> And below at 6.7.9.21 only aggregate was mentioned but not union, =
which
>> matched what I observed with an `cc -ftrivial-auto-var-init=3Dpattern`
>> invocation.
>=20
> The context for modspecific_t is (looking in my source
> tree for main):
>=20
> # grep -r modspecific_t /usr/main-src/sys/ | more
> /usr/main-src/sys/sys/module.h:} modspecific_t;
> /usr/main-src/sys/sys/module.h:void     module_setspecific(module_t, =
modspecific_t *);
> /usr/main-src/sys/sys/module.h: modspecific_t   data;
> /usr/main-src/sys/kern/kern_module.c:   modspecific_t           data;  =
 /* module specific data */
> /usr/main-src/sys/kern/kern_module.c:module_setspecific(module_t mod, =
modspecific_t *datap)
> /usr/main-src/sys/kern/kern_module.c:   modspecific_t   data;
> /usr/main-src/sys/kern/kern_module.c:   modspecific_t data;
> /usr/main-src/sys/kern/kern_syscalls.c: modspecific_t ms;
>=20
> This is C code, not C++ code. The languages are not fully
> matching in various details in some similar areas, especially
> when the vintages are not from a similar timeframe. Also,
> modspecific_t is a union:
>=20
> /*
> * A module can use this to report module specific data to the user via
> * kldstat(2).
> */
> typedef union modspecific {
>        int     intval;
>        u_int   uintval;
>        long    longval;
>        u_long  ulongval;
> } modspecific_t;
>=20
> When I look up C's N2716 in:
>=20
> https://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm
>=20
> I find: "N2716 2021/05/09 Tydeman, Numerically equal" with the
> link https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2716.htm .
> Its summary line is:
>=20
> QUOTE
> Summary: The phrases "numerically equal" and "numerically equivalent" =
are used, but never defined. Also, they seem to add no value to just =
"equal".
> END QUOTE
>=20
> (There is also: "N2847 2021/10/15 Thomas, C23 proposal - Revised
> suggested change from N2716" with the link:
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2847.pdf .)
>=20
> This has nothing to do with what you are writing about.
>=20
> For C code, I suggest only using C langauge specification
> materials, not C++, unless the same source is to have
> dual language use for some reason. (I see no indication
> of such here. If such use is the intended context, please
> be explicit about that.)
>=20
> Also, using material from 2021 when FreeBSD is based on a
> older C standard version can have its own problems. If I
> remember right, FreeBSD is basically targeting C11 or
> so for its C code.
>=20
> An example quote from footnote 310 (for memcmp) in ISO/IEC
> 9899:2011 is:
>=20
> QUOTE
> The contents of "holes" used as padding for purposes of
> alignment within structure objects are indeterminate.
> Strings shorter than the allocated space and unions may
> also cause problems in comparison.
> END QUOTE
>=20
> Initializing .intval and then accessing .ulongval resulting
> in 0xaaaaaaaa00000000 as a possible result should be no
> surprise for C as far as I can tell. Based on what I'm
> reading, code that assumes otherwise is incorrect relative
> to the C language standard.
>=20
> It looks to me like the bzero use is trying to make changes
> to allow non-standard C code "still work".

I missed the relationship between the earlier:

git: 2cab2d43b83b - main - syscalls: fix modspecific_t stack content =
leak

and:

git: 005aa1743b42 - main - modules: bzero the modspecific_t

and so did not comment on the implications in that direction.

One could read my notes as indicating that the bzero is
inappropriate to the memory content leak. What I wrote
instead supports use of bzero (or some such) for avoiding
memory content leaks.

For:

typedef union modspecific {
       int     intval;
       u_int   uintval;
       long    longval;
       u_long  ulongval;
} modspecific_t;

Code like:

{
. . .
       modspecific_t ms =3D { 0 };
. . .

assigns to .intval and falls under what I quoted about
"holes" and such. Also, ISO/IEC 9899:2011's explicit
list of unspecified behavior in J.1 Unspecified behavior
reports:

-- The value of padding bytes when storing values in structures
   or unions (6.2.6.1)

-- The values of bytes that correspond to union members other
   than the one last stored into (6.2.6.1)

"modspecific_t ms =3D { 0 };" stores specifically into intval
under the language standard's rules, and there are bytes in
longval and ulongval that are not in intval on at least
some platforms for FreeBSD.

Thus the language standard indicates that assignment leaves
the values of various bytes in the union unspecified --and that
allows the leak of some stack content as a possibility. The
bzero avoids that.

Sorry if I caused any confusion with my to narrow of a focus.

=3D=3D=3D
Mark Millard
marklmi at yahoo.com




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4729575A-9D5A-462B-9A35-E8B274601215>