From owner-freebsd-toolchain@FreeBSD.ORG Sun Nov 14 22:47:56 2010
Return-Path:
Delivered-To: freebsd-toolchain@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 083A0106564A
    for ; Sun, 14 Nov 2010 22:47:56 +0000 (UTC)
    (envelope-from erik@cederstrand.dk)
Received: from csmtp3.one.com (csmtp3.one.com [91.198.169.23])
    by mx1.freebsd.org (Postfix) with ESMTP id 8BA538FC12
    for ; Sun, 14 Nov 2010 22:47:55 +0000 (UTC)
Received: from macfeast.lan (0x573b9942.cpe.ge-1-2-0-1101.ronqu1.customer.tele.dk [87.59.153.66])
    by csmtp3.one.com (Postfix) with ESMTP id 2AFDD240632D
    for ; Sun, 14 Nov 2010 22:29:45 +0000 (UTC)
From: Erik Cederstrand
Content-Type: multipart/signed; boundary=Apple-Mail-2097-81456243; protocol="application/pkcs7-signature"; micalg=sha1
Date: Sun, 14 Nov 2010 23:29:45 +0100
Message-Id:
To: freebsd-toolchain@freebsd.org
Mime-Version: 1.0 (Apple Message framework v1081)
X-Mailer: Apple Mail (2.1081)
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: Clang and -frandom-seed
X-BeenThere: freebsd-toolchain@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Maintenance of FreeBSD's integrated toolchain
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 14 Nov 2010 22:47:56 -0000

--Apple-Mail-2097-81456243
Content-Type: text/plain; charset=us-ascii

Hello toolchainers,

I noticed that two consecutive builds of (GCC-built) Clang don't produce identical binaries. This is true for clang, clang++ and tblgen. I asked on the llvm-dev list yesterday, and it turns out it's because GCC uses a random seed on some symbols. Apparently, this can be controlled with the -frandom-seed flag. I haven't tested if this is also the case for Clang-built Clang.
I'm not sure I understand the exact implications, but I'm wondering if we could add the flag to the build scripts in FreeBSD in a way that both satisfies the randomness criteria and makes builds deterministic?

Thanks,
Erik

--Apple-Mail-2097-81456243--

From owner-freebsd-toolchain@FreeBSD.ORG Sun Nov 14 23:10:02 2010
Return-Path:
Delivered-To: freebsd-toolchain@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id 6D563106566C
    for ; Sun, 14 Nov 2010 23:10:02 +0000 (UTC)
    (envelope-from dim@FreeBSD.org)
Received: from tensor.andric.com (cl-327.ede-01.nl.sixxs.net [IPv6:2001:7b8:2ff:146::2])
    by mx1.freebsd.org (Postfix) with ESMTP id 229E88FC12
    for ; Sun, 14 Nov 2010 23:10:02 +0000 (UTC)
Received: from [IPv6:2001:7b8:3a7:0:20d4:5ad4:8ef9:2ce4] (unknown [IPv6:2001:7b8:3a7:0:20d4:5ad4:8ef9:2ce4])
    (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits))
    (No client certificate requested)
    by tensor.andric.com (Postfix) with ESMTPSA id 45ED35C43;
    Mon, 15 Nov 2010 00:09:59 +0100 (CET)
Message-ID: <4CE06C4F.7000002@FreeBSD.org>
Date: Mon, 15 Nov 2010 00:10:07 +0100
From: Dimitry Andric
Organization: The FreeBSD Project
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.2.13pre) Gecko/20101113 Lanikai/3.1.7pre
MIME-Version: 1.0
To: Erik Cederstrand
References:
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: freebsd-toolchain@freebsd.org
Subject: Re: Clang and -frandom-seed
X-List-Received-Date: Sun, 14 Nov 2010 23:10:02 -0000

On 2010-11-14 23:29, Erik Cederstrand wrote:
> I noticed that two consecutive builds of (GCC-built) Clang don't produce identical binaries. This is true for clang, clang++ and tblgen.
> I asked on the llvm-dev list yesterday, and it turns out it's because GCC uses a random seed on some symbols. Apparently, this can be controlled with the -frandom-seed flag. I haven't tested if this is also the case for Clang-built Clang.
>
> I'm not sure I understand the exact implications, but I'm wondering if we could add the flag to the build scripts in FreeBSD in a way that both satisfies the randomness criteria and makes builds deterministic?

Hmm, it seems this is only used for C++ compilations, not for plain C. The gcc manual says:

  -frandom-seed=string
      This option provides a seed that GCC uses when it would otherwise
      use random numbers.  It is used to generate certain symbol names
      that have to be different in every compiled file.  It is also used
      to place unique stamps in coverage data files and the object files
      that produce them.  You can use the -frandom-seed option to produce
      reproducibly identical object files.

      The string should be different for every file you compile.

The stanza "have to be different in every compiled file" is what worries me. There is probably a good reason that gcc needs to be able to emit unique function names. In contrib/gcc/tree.c, there's this piece of code:

  /* Generate a name for a function unique to this translation unit.
     TYPE is some string to identify the purpose of this function to the
     linker or collect2.  */

  tree
  get_file_function_name_long (const char *type)
  {
  ...
        /* We don't have anything that we know to be unique to this translation
           unit, so use what we do have and throw in some randomness.  */
  ...
        sprintf (q + len, "_%08X_%08X", crc32_string (0, name),
                 crc32_string (0, flag_random_seed));

So this is all on purpose, and I think it would be a bad idea to disable it, unless we fully understand the consequences.

On the other hand, the requirement "The string should be different for every file you compile", could possibly be fulfilled. Maybe by using the filename, relative to $SRCDIR, that is being compiled as "seed"?
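As a rough illustration of the filename-as-seed idea, here is a small Python model. It is only a sketch under stated assumptions, not GCC's code: `unique_suffix` and the two sample paths are hypothetical, and `zlib.crc32` merely stands in for GCC's own `crc32_string`, so the actual numbers would differ. It only shows that a seed derived from the relative path gives suffixes that differ between files yet repeat from one build to the next.

```python
import zlib


def unique_suffix(name, seed):
    """Format a suffix in the "_%08X_%08X" shape quoted from tree.c.

    zlib.crc32 is a stand-in (assumption); GCC's crc32_string is a
    different routine and would produce different values.
    """
    return "_%08X_%08X" % (zlib.crc32(name.encode()), zlib.crc32(seed.encode()))


# Hypothetical source paths, relative to the top of the source tree.
sources = ["lib/a/Foo.cpp", "lib/b/Foo.cpp"]

# Seed every compilation with its own relative path, for two separate "builds".
first_build = [unique_suffix("Foo.cpp", path) for path in sources]
second_build = [unique_suffix("Foo.cpp", path) for path in sources]

print(first_build == second_build)       # suffixes repeat across builds
print(first_build[0] != first_build[1])  # but differ between the two files
```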
This would be unique for each compiled file, but still give the same result for each build, and also be independent of the particular machine you are building on.

From owner-freebsd-toolchain@FreeBSD.ORG Mon Nov 15 10:10:39 2010
Return-Path:
Delivered-To: freebsd-toolchain@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id E51C2106564A;
    Mon, 15 Nov 2010 10:10:38 +0000 (UTC)
    (envelope-from erik@cederstrand.dk)
Received: from csmtp3.one.com (csmtp3.one.com [91.198.169.23])
    by mx1.freebsd.org (Postfix) with ESMTP id 7A2348FC08;
    Mon, 15 Nov 2010 10:10:38 +0000 (UTC)
Received: from [192.168.0.22] (0x573fa596.cpe.ge-1-1-0-1109.ronqu1.customer.tele.dk [87.63.165.150])
    by csmtp3.one.com (Postfix) with ESMTP id A78642406D6E;
    Mon, 15 Nov 2010 10:10:36 +0000 (UTC)
Mime-Version: 1.0 (Apple Message framework v1081)
Content-Type: multipart/signed; boundary=Apple-Mail-2114-123506905; protocol="application/pkcs7-signature"; micalg=sha1
From: Erik Cederstrand
In-Reply-To: <4CE06C4F.7000002@FreeBSD.org>
Date: Mon, 15 Nov 2010 11:10:35 +0100
Message-Id:
References: <4CE06C4F.7000002@FreeBSD.org>
To: Dimitry Andric
X-Mailer: Apple Mail (2.1081)
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: freebsd-toolchain@freebsd.org
Subject: Re: Clang and -frandom-seed
X-List-Received-Date: Mon, 15 Nov 2010 10:10:39 -0000

--Apple-Mail-2114-123506905
Content-Type: text/plain; charset=us-ascii

On 15/11/2010 at 00:10, Dimitry Andric wrote:

> On 2010-11-14 23:29, Erik Cederstrand wrote:
>> I noticed that two consecutive builds of (GCC-built) Clang don't produce identical binaries. This is true for clang, clang++ and tblgen.
>> I asked on the llvm-dev list yesterday, and it turns out it's because GCC uses a random seed on some symbols. Apparently, this can be controlled with the -frandom-seed flag. I haven't tested if this is also the case for Clang-built Clang.
>>
> [...]
> So this is all on purpose, and I think it would be a bad idea to disable
> it, unless we fully understand the consequences.
>
> On the other hand, the requirement "The string should be different for
> every file you compile", could possibly be fulfilled. Maybe by using
> the filename, relative to $SRCDIR, that is being compiled as "seed"?
>
> This would be unique for each compiled file, but still give the same
> result for each build, and also be independent of the particular machine
> you are building on.

I was thinking of something along the same lines. I think we agree that it only needs to be random across files, not across builds. Someone on llvm-dev also suggested using the path (either full or relative to src/) as a seed.

Where in the build scripts would I need to add this flag? Something like:

CXXFLAGS += -frandom-seed=${.TARGET}

in src.conf?
Thanks,
Erik

--Apple-Mail-2114-123506905--

From owner-freebsd-toolchain@FreeBSD.ORG Fri Nov 19 15:51:32 2010
Return-Path:
Delivered-To: freebsd-toolchain@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
    by hub.freebsd.org (Postfix) with ESMTP id C2CFF1065694;
    Fri, 19 Nov 2010 15:51:32 +0000 (UTC)
    (envelope-from erik@cederstrand.dk)
Received: from csmtp1.one.com (csmtp1.one.com [195.47.247.21])
    by mx1.freebsd.org (Postfix) with ESMTP id 4683F8FC1B;
    Fri, 19 Nov 2010 15:51:32 +0000 (UTC)
Received: from [192.168.1.9] (0x573bad9c.ronqu1.dynamic.dsl.tele.dk [87.59.173.156])
    by csmtp1.one.com (Postfix) with ESMTP id 8BF4F1BC00E3B;
    Fri, 19 Nov 2010 15:51:30 +0000 (UTC)
From: Erik Cederstrand
Mime-Version: 1.0 (Apple Message framework v1081)
Content-Type: multipart/signed; boundary=Apple-Mail-2606-489561436; protocol="application/pkcs7-signature"; micalg=sha1
Date: Fri, 19 Nov 2010 16:51:30 +0100
In-Reply-To:
References: <4CE06C4F.7000002@FreeBSD.org>
Message-Id: <68B6258D-6853-4FF0-BE09-13B8E99BC874@cederstrand.dk>
X-Mailer: Apple Mail (2.1081)
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Cc: freebsd-toolchain@freebsd.org, Dimitry Andric
Subject: Re: Clang and -frandom-seed
X-List-Received-Date: Fri, 19 Nov 2010 15:51:32 -0000

--Apple-Mail-2606-489561436
Content-Type: text/plain; charset=us-ascii

On 15/11/2010 at 11:10, Erik Cederstrand wrote:

>
> On 15/11/2010 at 00:10, Dimitry Andric wrote:
>
>> On 2010-11-14 23:29, Erik Cederstrand wrote:
>>> I noticed that two consecutive builds of (GCC-built) Clang don't produce identical binaries. This is true for clang, clang++ and tblgen.
>>> I asked on the llvm-dev list yesterday, and it turns out it's because GCC uses a random seed on some symbols. Apparently, this can be controlled with the -frandom-seed flag. I haven't tested if this is also the case for Clang-built Clang.
>>>
>> [...]
>> So this is all on purpose, and I think it would be a bad idea to disable
>> it, unless we fully understand the consequences.
>>
>> On the other hand, the requirement "The string should be different for
>> every file you compile", could possibly be fulfilled. Maybe by using
>> the filename, relative to $SRCDIR, that is being compiled as "seed"?
>>
>> This would be unique for each compiled file, but still give the same
>> result for each build, and also be independent of the particular machine
>> you are building on.
>
> I was thinking of something along the same lines. I think we agree that it only needs to be random across files, not across builds. Someone on llvm-dev also suggested using the path (either full or relative to src/) as a seed.
>
> Where in the build scripts would I need to add this flag? Something like:
>
> CXXFLAGS += -frandom-seed=${.TARGET}
>
> in src.conf?

Poking around, I decided to add this patch:

# svn diff lib/clang/clang.build.mk
Index: lib/clang/clang.build.mk
===================================================================
--- lib/clang/clang.build.mk (revision 215422)
+++ lib/clang/clang.build.mk (working copy)
@@ -28,6 +28,13 @@
 CXXFLAGS+=-fno-rtti
 .endif
 
+.ifdef WITH_DETERMINISTIC
+CXXFLAGS+=-frandom-seed=0
+.endif

and added "WITH_DETERMINISTIC=true" to /etc/src.conf. The "random-seed=0" should ensure that the same random elements are generated every time.

I then ran two buildworlds with the same MAKEOBJDIRPREFIX but two different DESTDIRs, and compared the two clang binaries.
The random-seed option does show up in the log, so it's getting picked up, but apparently it was not enough, as the random elements are still different.

Any hints on where in the build infrastructure I should add the flag, or what to grep for in the buildworld log to find out what's wrong?

Also, how can I compile just clang? I tried "cd src/usr.bin/clang; make" but it dies violently:

/usr/home/erik/freebsd/head/src/usr.bin/clang/clang/../../../contrib/llvm/tools/clang/include/clang/Basic/Diagnostic.h: At global scope:
/usr/home/erik/freebsd/head/src/usr.bin/clang/clang/../../../contrib/llvm/tools/clang/include/clang/Basic/Diagnostic.h:999: error: expected ',' or '...' before '&' token
/usr/home/erik/freebsd/head/src/usr.bin/clang/clang/../../../contrib/llvm/tools/clang/include/clang/Basic/Diagnostic.h:1000: error: ISO C++ forbids declaration of 'LangOptions' with no type
/usr/home/erik/freebsd/head/src/usr.bin/clang/clang/../../../contrib/llvm/tools/clang/include/clang/Basic/Diagnostic.h:1019: error: expected declaration before '}' token
*** Error code 1

Stop in /usr/home/erik/freebsd/head/src/usr.bin/clang/clang.
*** Error code 1

Stop in /usr/home/erik/freebsd/head/src/usr.bin/clang.

Thanks,
Erik

--Apple-Mail-2606-489561436--
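One way to narrow down where two builds still diverge is to compare every file in the two output trees rather than only the final clang binaries. The following is a minimal Python sketch under stated assumptions: it presumes the two builds were kept in separate directories (unlike the shared MAKEOBJDIRPREFIX described above), and `file_digests`, `differing_files` and the file names in the demo are hypothetical helpers, not part of the FreeBSD build system.

```python
import hashlib
import os
import tempfile


def file_digests(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            with open(full, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            digests[os.path.relpath(full, root)] = digest
    return digests


def differing_files(root_a, root_b):
    """Return sorted relative paths that differ or exist in only one tree."""
    a, b = file_digests(root_a), file_digests(root_b)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))


# Demo with two throwaway trees standing in for two build outputs.
with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    for root, payload in ((first, b"seed-1111"), (second, b"seed-2222")):
        with open(os.path.join(root, "same.o"), "wb") as handle:
            handle.write(b"identical")
        with open(os.path.join(root, "varies.o"), "wb") as handle:
            handle.write(payload)
    changed = differing_files(first, second)

print(changed)
```

Run against two real object trees, the list of differing files would at least show whether the mismatch comes from a few C++ objects or from something applied later, such as the link step.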