From: Stefan Eßer <se@freebsd.org>
To: freebsd-hackers@freebsd.org
Cc: dewaynegeraghty@gmail.com
Subject: Re: Executable size difference between clang and gcc9
Date: Wed, 20 Nov 2019 12:48:22 +0100

On 20.11.19 at 04:09, Dewayne Geraghty wrote:
> I noticed the executable size
> difference between clang 8.0.1 and gcc 9.2.0 of a simple test code,
> built on FreeBSD12.1S (r353671M), below:
>
> Differences in object code seem reasonable:

Compiling with "-flto -c" gives quite different file types, depending
on the compiler used.

> # clang -O2 -march=haswell -flto -c "qdate.c"
> -rw-r----- 1 root wheel 3896 8 Nov 12:59 qdate.o

# file qdate.o
qdate.o: LLVM IR bitcode
# ls -l qdate.o
-rw-r--r-- 1 se se 4092 20 Nov. 12:08 qdate.o

> # gcc9 -O2 -march=haswell -flto -c "qdate.c" ; ls -l qdate.o a.out
> -rw-r----- 1 root wheel 5256 8 Nov 13:00 qdate.o

# file qdate.o
qdate.o: ELF 64-bit LSB relocatable, x86-64, version 1 (FreeBSD), not stripped
# ls -l qdate.o
-rw-r--r-- 1 se se 5248 20 Nov. 12:07 qdate.o

> But the executable sizes?

For one thing, you did not strip the binary, and there are different
amounts of symbol and debug information in the binaries.

It appears that you used the GNU ld from binutils for GCC, but the LLVM
linker ld.lld for Clang, and they may align regions differently or
differ in other respects.

> # clang -O2 -march=haswell -flto "qdate.c" ; ls -l qdate.o a.out
> -rwxr-x--- 1 root wheel 16360 8 Nov 13:10 a.out

# ls -l a.out
-rwxr-xr-x 1 se se 24728 20 Nov. 12:17 a.out
# size a.out
   text    data     bss     dec     hex filename
   2065     448      16    2529   0x9e1 a.out
# strip a.out
# ls -l a.out
-rwxr-xr-x 1 se se 15120 20 Nov. 12:17 a.out

> # gcc9 -O2 -march=haswell -flto "qdate.c" ;ls -l qdate.o a.out
> -rwxr-x--- 1 root wheel 8736 8 Nov 13:09 a.out

# ls -l a.out
-rwxr-xr-x 1 se se 14472 20 Nov. 12:18 a.out
# size a.out
   text    data     bss     dec     hex filename
   2023     464      24    2511   0x9cf a.out
# strip a.out
# ls -l a.out
-rwxr-xr-x 1 se se 5320 20 Nov. 12:19 a.out

> Is this size variation expected, and what is contributing to this
> difference?

There is no real difference ;-)

While the file size after stripping remains higher for the clang case,
the actual code and data segment sizes are nearly identical (2529 vs.
2511 bytes in total, according to size).
The file size difference is due to page alignment of sections performed
by the LLVM linker. It slightly increases the file size of the binary,
but this overhead matters little for programs of typical size.

> The executables are the same size with/without lto; and both link to
> /lib/libc
> a.out:
> libc.so.7 => /lib/libc.so.7 (0x800647000)

Yes, -flto cannot make a difference when compiling just a single source
file. The library is not subject to link-time optimization, anyway.

> /* Sample code */

Reformatted for readability:

> #include <stdio.h>
> #include <sys/time.h>
>
> int main (int argc, char **argv)
> {
>     struct timeval tv;
>     gettimeofday(&tv, NULL);
>     if (argc > 1)
>         printf("%ld.%ld\n",tv.tv_sec,tv.tv_usec);
>     else
>         printf("%ld\n",tv.tv_sec);
> }

> The verbose compile/link command is available at
> http://www.heuristicsystems.com/FreeBSD-compiler/
> contains: clang.lis gcc9.lis qdate.c
>
> PS who said anything about placing malware on the end of executables,
> at the compilation step? Really I'm not paranoid... :)

Use hd to look at the generated binary (after stripping, to get rid of
the symbols) and you'll see that there is nothing hidden.

You can also disassemble the file (with source lines as comments) to
check the validity of the generated code. But with -O2 the instructions
will be rearranged considerably relative to the source lines.

If you want to disassemble the program (compiled with "-g"):

# llvm-objdump90 --source -g a.out

(Use a different objdump version if you only have an older clang on
your system. The LLVM objdump works equally well on binaries generated
by GCC and Clang.)

My tests were performed on a CURRENT/amd64 system.

Regards, Stefan