Date:      Thu, 23 Feb 2023 12:23:02 -0800
From:      Mark Millard <marklmi@yahoo.com>
To:        "Simon J. Gerraty" <sjg@juniper.net>
Cc:        Bryan Drewery <bdrewery@FreeBSD.org>, Current FreeBSD <freebsd-current@FreeBSD.org>, Peter <pmc@citylink.dinoex.sub.org>
Subject:   Re: FYI: Why META_MODE rebuilds so much for building again after installworld (no source changes) [code level bug evidence]
Message-ID:  <B2C8F37E-70E6-4C61-9232-50A2B5A548AF@yahoo.com>
In-Reply-To: <266ED18F-9249-46BB-BF96-1D4C5B46FCFC@yahoo.com>
References:  <B74790D9-FBC2-4818-BEAF-34E5B705C460@yahoo.com> <3345EBA5-A09C-4E3F-B94D-39F57F56BDBB@yahoo.com> <DB0C7B41-2101-4C5C-BFC8-3C95CC0B9F6F@yahoo.com> <73088.1611797582@kaos.jnpr.net> <CB7040D0-3BF4-496F-A54F-87E5378016E0@yahoo.com> <F6BF110D-7855-4A10-A53F-52B34282234F@yahoo.com> <10819.1677108389@kaos.jnpr.net> <76FA98EF-6184-4D7E-A01F-0EE8117D0D10@yahoo.com> <29887.1677115125@kaos.jnpr.net> <27790339-240F-4C97-97C7-38AFD8DE03D5@yahoo.com> <72419.1677133429@kaos.jnpr.net> <B11DA944-90E7-42D9-81A4-145686767305@yahoo.com> <266ED18F-9249-46BB-BF96-1D4C5B46FCFC@yahoo.com>

On Feb 23, 2023, at 11:53, Mark Millard <marklmi@yahoo.com> wrote:

> cached_realpath only reports its "cached_realpath:" notice
> (not the purging one) when it does not find the value via
> HashTable_FindValue and so does a HashTable_Set :
>
> const char *
> cached_realpath(const char *pathname, char *resolved)
> {
>        const char *rp;
>
>        if (pathname == NULL || pathname[0] == '\0')
>                return NULL;
>
>        rp = HashTable_FindValue(&cached_realpaths, pathname);
>        if (rp != NULL) {
>                /* a hit */
>                strncpy(resolved, rp, MAXPATHLEN);
>                resolved[MAXPATHLEN - 1] = '\0';
>                return resolved;
>        }
>
>        rp = realpath(pathname, resolved);
>        if (rp != NULL) {
>                HashTable_Set(&cached_realpaths, pathname, bmake_strdup(rp));
>                DEBUG2(DIR, "cached_realpath: %s -> %s\n", pathname, rp);
>                return resolved;
>        }
>
>        /* should we negative-cache? */
>        return NULL;
> }
>
> cached_realpaths is global:
>
> static HashTable cached_realpaths;
>
> So with -ddM why do I see lots of "cached_realpath:"
> notices for the same path? For example:
>=20
> # grep "tmp/legacy/usr/sbin/ln\>" /usr/obj/BUILDs/main-amd64-nodbg-clang/sys-typescripts/typescript-make-amd64-nodbg-clang-amd64-host-2023-02-23:10:20:26 | more
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
>   Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/usr.bin/awk/awkgram.tab.h.meta: 22: file '/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln' is newer than the target...
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
>   Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
>   Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
>   Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
>   Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> cached_realpath: /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln -> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/bin/ln
>   Caching 02:49:37 Feb 23, 2023 for /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln
> /usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/usr.bin/awk/awkgram.tab.h.meta: 22: file '/usr/obj/BUILDs/main-amd64-nodbg-clang/usr/main-src/amd64.amd64/tmp/legacy/usr/sbin/ln' is newer than the target...
> . . .
>
> A possible cause is something I ran into while looking around:
>
> /* A read-only range of a character array, NOT null-terminated. */
> typedef struct Substring {
>        const char *start;
>        const char *end;
> } Substring;
> . . .
> MAKE_STATIC Substring
> Substring_Init(const char *start, const char *end)
> {
>        Substring sub;
>
>        sub.start = start;
>        sub.end = end;
>        return sub;
> }
> . . .
> /* Find the entry corresponding to the key, or return NULL. */
> HashEntry *
> HashTable_FindEntry(HashTable *t, const char *key)
> {
>        const char *keyEnd;
>        unsigned int h = Hash_String(key, &keyEnd);
>        return HashTable_Find(t, Substring_Init(key, keyEnd), h);
> }
> . . .
> /* This hash function matches Gosling's Emacs and java.lang.String. */
> static unsigned int
> Hash_String(const char *key, const char **out_keyEnd)
> {
>        unsigned int h;
>        const char *p;
>
>        h = 0;
>        for (p = key; *p != '\0'; p++)
>                h = 31 * h + (unsigned char)*p;
>
>        *out_keyEnd = p;
>        return h;
> }
>
> But after the loop: *p=='\0' so *out_keyEnd=='\0'
> and the FindEntry Substring_Init(key, keyEnd) ends
> up including the '\0' byte.
>
> But note that the h in Hash_String did not include the
> '\0' byte. Call this h value h_VALUE0 for later reference.
> Then look at:
>
> /* This hash function matches Gosling's Emacs and java.lang.String. */
> unsigned int
> Hash_Substring(Substring key)
> {
>        unsigned int h;
>        const char *p;
>
>        h = 0;
>        for (p = key.start; p != key.end; p++)
>                h = 31 * h + (unsigned char)*p;
>        return h;
> }
>
> This h does include the '\0' byte so h==(unsigned int)(31*h_VALUE0).

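Dumb mistake on my part. Actually *(key.end) is never used, even
if *(key.end) != '\0': the Hash_Substring loop stops when p == key.end,
so that byte is not hashed and the two hash values agree.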
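
A minimal standalone sketch of that point, not bmake's actual source
layout: hash_string and hash_substring are lowercase copies of the two
quoted functions, while hashes_agree and check_prefix_key are
hypothetical helpers added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [start, end), as in the quoted Substring type. */
typedef struct Substring {
	const char *start;
	const char *end;
} Substring;

/* Hashes every byte before the terminating '\0' and reports where it stopped. */
static unsigned int
hash_string(const char *key, const char **out_keyEnd)
{
	unsigned int h = 0;
	const char *p;

	for (p = key; *p != '\0'; p++)
		h = 31 * h + (unsigned char)*p;
	*out_keyEnd = p;	/* points at the '\0', which was not hashed */
	return h;
}

/* Hashes the bytes in [key.start, key.end); the byte at key.end is never read. */
static unsigned int
hash_substring(Substring key)
{
	unsigned int h = 0;
	const char *p;

	for (p = key.start; p != key.end; p++)
		h = 31 * h + (unsigned char)*p;
	return h;
}

/* Hypothetical helper: returns 1 when both hashes agree for a C string. */
static int
hashes_agree(const char *key)
{
	const char *keyEnd;
	unsigned int h1 = hash_string(key, &keyEnd);
	Substring sub = { key, keyEnd };

	return h1 == hash_substring(sub);
}

/* Hypothetical self-check: the key is a prefix of a longer buffer. */
static void
check_prefix_key(void)
{
	const char *buf = "ln-extra";
	const char *keyEnd;
	Substring sub = { buf, buf + 2 };	/* "ln"; *(sub.end) is '-' */

	assert(hash_substring(sub) == hash_string("ln", &keyEnd));
}
```

Calling hashes_agree() on any NUL-terminated path should return 1, and
check_prefix_key() should pass even though *(sub.end) is '-' rather
than '\0'.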

> I expect the mismatched hash values explain the repeated
> "cached_realpath:" notices for the same path: inserted
> but never found.

Still, the comments and code do not match, and I've not
checked all usage for assumptions about *(key.end)
vs. '\0'.
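
For reference, a sketch of what a use that makes no assumption about
*(key.end) looks like: it works from the length of the range only.
The names substring_len and substring_equals_cstr are hypothetical
and are not claimed to match anything in bmake.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Half-open range [start, end), as in the quoted Substring type. */
typedef struct Substring {
	const char *start;
	const char *end;
} Substring;

/* Hypothetical helper: number of bytes in [start, end). */
static size_t
substring_len(Substring sub)
{
	assert(sub.start <= sub.end);
	return (size_t)(sub.end - sub.start);
}

/*
 * Hypothetical helper: compares a Substring with a NUL-terminated
 * string using the range length and memcmp, so *(sub.end) is never read.
 */
static int
substring_equals_cstr(Substring sub, const char *str)
{
	size_t len = substring_len(sub);

	return strlen(str) == len && memcmp(sub.start, str, len) == 0;
}
```

Code that instead passed sub.start to strcmp or strlen would be
relying on *(sub.end) == '\0', which the Substring comment says is
not guaranteed.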


===
Mark Millard
marklmi at yahoo.com



