Date: Sun, 15 Nov 2009 19:41:18 +0000
From: "b. f." <bf1783@googlemail.com>
To: Chris <christopher-ml@telting.org>
Cc: freebsd-questions@FreeBSD.org
Subject: Re: Produce identical packages for checksum comparison?
Message-ID: <d873d5be0911151141r65f182axd48a2c767d2486d5@mail.gmail.com>
In-Reply-To: <4B002741.4000403@telting.org>
References: <d873d5be0911141823o40f16depea7f6dc5090801a3@mail.gmail.com> <4B002741.4000403@telting.org>

On 11/15/09, Chris <christopher-ml@telting.org> wrote:
> b. f. wrote:
>> Chris wrote:
...
>> Even if you edited your
>> filesystem or archives to change the timestamps of package files, the
>>
> I think that could be accomplished through the port makefiles.

I think that the exact reproduction of whole archives will be
problematic, unless you have a means of changing the ctime of the
binaries that have been built to a predetermined value.

>> toolchain used to create the binary files in packages often injects
>> random seeds, timestamps, file paths, uid/gid information, etc. that
>>
> I can understand file paths with debug info. Timestamps? Ok sure for a
> timestamp file being generated during a make that auto increments version
> numbers. What would change about uid/gid? I can't imagine why that
> might be in the binaries.

ar(1) and some of the other utilities inject this information into
certain binary files. Try running 'objdump -a' on, for example, a
static archive like /usr/lib/libc.a. Of course this information can
be manipulated, but you have to do it yourself. See the patches in the
link I cited earlier for other examples.

...

> Why would the build tools be injecting random numbers into binaries?

Usually to provide some degree of uniqueness. I'm not saying that it
is always done, just that it _may_ be done. See, for example, the gcc
sources or the -frandom-seed option description in gcc(1). And it may
not be just the compiler toolchain -- a port may do it.

Occasionally, there are other sources of non-determinism. For
example, in a recent thesis, a researcher who was trying to use
reproducible builds to defeat a longstanding security threat found
that the tcc compiler produced non-deterministic builds because of a
defect in sign-extending some casts, and a problem with long double
output. He also cited another researcher's finding that the output of
a certain Java compiler depended on the heap memory addresses used
during compilation.
See:

http://www.dwheeler.com/trusting-trust/dissertation/wheeler-trusting-trust-ddc.pdf

...

> If I concentrated on one problem at a time I would never get anything done.

?! :)

b.
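
To make the ar(1) point concrete: in the common ar archive layout, each member is preceded by a 60-byte ASCII header that records the member's modification time, uid, gid and mode. The following is only a minimal sketch in Python, assuming that common layout and ignoring the BSD and GNU long-name extensions; the names iter_ar_members and build_archive are hypothetical helpers made up for this illustration, and the timestamps and ids are arbitrary. It builds two tiny archives in memory with identical member contents and shows that the archives still differ byte for byte.

```python
import io
from typing import BinaryIO, Iterator, List, NamedTuple, Tuple

AR_MAGIC = b"!<arch>\n"   # global magic at the start of every ar archive
HEADER_SIZE = 60          # fixed size of each member header


class ArMember(NamedTuple):
    name: str
    mtime: int
    uid: int
    gid: int
    mode: int
    size: int


def _num(field: bytes, base: int = 10) -> int:
    """Parse a space-padded ASCII number; an empty field counts as 0."""
    text = field.decode("ascii").strip()
    return int(text, base) if text else 0


def iter_ar_members(f: BinaryIO) -> Iterator[ArMember]:
    """Yield the header fields of each member (hypothetical helper)."""
    if f.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    while True:
        header = f.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            return
        if header[58:60] != b"`\n":
            raise ValueError("bad ar member header")
        size = _num(header[48:58])
        yield ArMember(
            name=header[0:16].decode("ascii", "replace").rstrip(),
            mtime=_num(header[16:28]),
            uid=_num(header[28:34]),
            gid=_num(header[34:40]),
            mode=_num(header[40:48], 8),   # mode is stored in octal
            size=size,
        )
        # Member data is padded to an even length.
        f.seek(size + (size % 2), 1)


def build_archive(members: List[Tuple[str, bytes, int, int, int]]) -> bytes:
    """Build a tiny archive from (name, data, mtime, uid, gid) tuples.

    Hypothetical helper for the demonstration only; short names only.
    """
    out = bytearray(AR_MAGIC)
    mode = 0o644
    for name, data, mtime, uid, gid in members:
        header = (
            f"{name:<16}{mtime:<12}{uid:<6}{gid:<6}{mode:<8o}{len(data):<10}`\n"
        )
        out += header.encode("ascii")
        out += data
        if len(data) % 2:
            out += b"\n"
    return bytes(out)


if __name__ == "__main__":
    # Same member contents, different build time and builder uid/gid.
    first = build_archive([("a.o", b"abc", 1258314078, 1001, 1001)])
    second = build_archive([("a.o", b"abc", 1258400000, 0, 0)])
    print("archives identical:", first == second)
    for member in iter_ar_members(io.BytesIO(first)):
        print(member)
```

Writing fixed values into these fields (or rewriting them after the build) is the kind of manipulation meant above; the member contents themselves may of course still differ for the other reasons discussed.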