Date: Wed, 18 Dec 2024 09:12:36 -0800
From: Gleb Smirnoff <glebius@freebsd.org>
To: Ed Maste <emaste@freebsd.org>
Cc: John Baldwin <jhb@freebsd.org>, src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org
Subject: Re: git: a1097094c4c5 - main - newvers: Set explicit git revision length
Message-ID: <Z2MChH8931gQACQ7@cell.glebi.us>
In-Reply-To: <CAPyFy2CxNkcA93P-3q-WSNiCXv4DxaBx6YP1p1s=VAhOaaKGMw@mail.gmail.com>
References: <202412131306.4BDD6bxu011253@gitrepo.freebsd.org> <e827f951-e747-45d6-b4d8-a74a18734bae@FreeBSD.org> <CAPyFy2BC3Nn%2B7t3kNqhpjUJbdFG3SV4EErs0xS9kR0ufOiQ3XA@mail.gmail.com> <e9cf66e6-e43a-4ee4-a622-b1c5e0c1aa75@FreeBSD.org> <CAPyFy2CxNkcA93P-3q-WSNiCXv4DxaBx6YP1p1s=VAhOaaKGMw@mail.gmail.com>

On Wed, Dec 18, 2024 at 10:22:24AM -0500, Ed Maste wrote:
E> That said, it doesn't matter what Git's algorithm chooses as the short
E> hash length; specifying --short bypasses that algorithm. `git
E> rev-parse --verify --short=12 HEAD` will give us a 12-character short
E> hash as long as that hash is unique. The reproducibility concern is
E> thus: what is the probability that the 12-character short hash is
E> unique at the time and in a repo from which an image is built, but is
E> not unique for the attempt to reproduce it, or vice-versa. This
E> probability is rather small.
E>
E> If you look at arbitrary commits 6 or 7 characters are usually
E> sufficient for a unique hash today. For instance, some latest -pX from
E> recent releng/ branches:
E>
E> 13.3: 72aa3d
E> 13.4: 3f40d5
E> 14.0: f10e32
E> 14.1: 74b6c98
E> 14.2: c8918d6
E>
E> The status quo of --short=12 should be fine for quite some time.

AFAIU John's concern is that you can't guarantee a reproducible build
from a "dirty" repository, that is, a repository that has more branches
than just the official ones.

I just ran a quick check on the Netflix repo, which has both the current
FreeBSD history and the before-the-official-git history together, as
well as the split ports subdirectories and of course our own stuff. For
short hashes there are roughly 2x more ambiguities than in a "clean"
repo. Apparently the chance of a collision on a long hash is also
doubled.

We can of course say that we don't provide reproducible builds from a
"dirty" repo. But that would be a real limitation. It would rule out a
legitimate scenario:

git subtree add FreeBSD && cd FreeBSD && make a reproducible build

--
Gleb Smirnoff
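
A back-of-the-envelope sketch of why doubling the number of objects in a
repository roughly doubles the chance of an ambiguous short hash. It
assumes object IDs behave like uniformly random hex strings; the
function name and the object count of 5,000,000 are hypothetical and
chosen only for illustration, not measured from any FreeBSD or Netflix
repository.

```python
import math


def prefix_collision_probability(num_other_objects: int, hex_digits: int) -> float:
    """Chance that at least one of num_other_objects object IDs shares a
    given hex_digits-long prefix, assuming uniformly random IDs."""
    per_object = 16.0 ** -hex_digits
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p is not rounded away.
    return -math.expm1(num_other_objects * math.log1p(-per_object))


if __name__ == "__main__":
    clean_objects = 5_000_000          # hypothetical "clean" repo size
    dirty_objects = 2 * clean_objects  # hypothetical repo with twice the objects
    for digits in (6, 7, 12):
        p_clean = prefix_collision_probability(clean_objects, digits)
        p_dirty = prefix_collision_probability(dirty_objects, digits)
        print(f"--short={digits}: clean {p_clean:.3e}, dirty {p_dirty:.3e}")
```

At 12 hex digits the per-commit probability is so small that it scales
almost linearly with the object count, which matches the observation
above that a repo with about twice the content sees about twice the
ambiguities.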