Date:      Wed, 20 Nov 2019 12:48:22 +0100
From:      Stefan Eßer <se@freebsd.org>
To:        freebsd-hackers@freebsd.org
Cc:        dewaynegeraghty@gmail.com
Subject:   Re: Executable size difference between clang and gcc9
Message-ID:  <c8363caa-b6b8-12e7-d11a-f122afc9dd74@freebsd.org>
In-Reply-To: <CAGnMC6ptpLNTHMXXYroV28WLN7dqLKYSL1jCsa=h5uZZmuLVZQ@mail.gmail.com>
References:  <CAGnMC6ptpLNTHMXXYroV28WLN7dqLKYSL1jCsa=h5uZZmuLVZQ@mail.gmail.com>

On 20.11.19 at 04:09, Dewayne Geraghty wrote:
>  I noticed an executable size difference between clang 8.0.1 and gcc
> 9.2.0 for a simple test program, built on FreeBSD12.1S (r353671M), below:
> 
> Differences in object code seem reasonable:

Compiling with "-flto -c" gives quite different file types,
depending on the compiler used.

> # clang -O2 -march=haswell -flto -c "qdate.c"
> -rw-r-----  1 root  wheel  3896  8 Nov 12:59 qdate.o

# file qdate.o
qdate.o: LLVM IR bitcode
# ls -l qdate.o
-rw-r--r--  1 se  se  4092 20 Nov. 12:08 qdate.o

> # gcc9 -O2 -march=haswell -flto -c "qdate.c" ; ls -l qdate.o a.out
> -rw-r-----  1 root  wheel  5256  8 Nov 13:00 qdate.o

# file qdate.o
qdate.o: ELF 64-bit LSB relocatable, x86-64, version 1 (FreeBSD), not
stripped
# ls -l qdate.o
-rw-r--r--  1 se  se  5248 20 Nov. 12:07 qdate.o

> But the executable sizes?

For one thing, you did not strip the binary, and there are different
amounts of debug information in the binaries. It appears that you
used the GNU ld from binutils for GCC, but the llvm linker ld.lld
for CLANG, and they may use different alignment of regions or differ
in other aspects.

> # clang -O2 -march=haswell -flto "qdate.c" ; ls -l qdate.o a.out
> -rwxr-x---  1 root  wheel  16360  8 Nov 13:10 a.out

# ls -l a.out
-rwxr-xr-x  1 se  se  24728 20 Nov. 12:17 a.out
#  size a.out
  text   data   bss    dec     hex   filename
  2065    448    16   2529   0x9e1   a.out
# strip a.out
# ls -l a.out
-rwxr-xr-x  1 se  se  15120 20 Nov. 12:17 a.out

> # gcc9 -O2 -march=haswell -flto "qdate.c" ;ls -l qdate.o a.out
> -rwxr-x---  1 root  wheel  8736  8 Nov 13:09 a.out

# ls -l a.out
-rwxr-xr-x  1 se  se  14472 20 Nov. 12:18 a.out
# size a.out
  text   data   bss    dec     hex   filename
  2023    464    24   2511   0x9cf   a.out
# strip a.out
# ls -l a.out
-rwxr-xr-x  1 se  se  5320 20 Nov. 12:19 a.out

> Is this size variation expected, and what is contributing to this
> difference?

There is no difference ;-)

While the file size after stripping remains higher for the clang case,
the actual code and data sizes are nearly identical: size reports 2529
bytes in total for clang versus 2511 bytes for gcc9.

The file size difference is due to the page alignment of sections that
the LLVM linker performs. The padding adds a few kilobytes to the file
(here 15120 versus 5320 bytes after stripping), which is insignificant
for programs of typical size.

> The executables are the same size with/without lto; and both link to
> /lib/libc
> a.out:
>         libc.so.7 => /lib/libc.so.7 (0x800647000)

Yes, -flto cannot make a difference when compiling just a single source
file. The library is not subject to link time optimizations, anyway.

> /* Sample code */

Reformatted for readability:

> #include <stdio.h>
> #include <sys/time.h>
> 
> int main (int argc, char **argv)
> {
> 	struct timeval tv; gettimeofday(&tv, NULL);
> 	if (argc > 1)
> 		printf("%ld.%ld\n",tv.tv_sec,tv.tv_usec);
> 	else
> 		printf("%ld\n",tv.tv_sec);
> }

> The verbose compile/link command is available at
> http://www.heuristicsystems.com/FreeBSD-compiler/
> contains: clang.lis gcc9.lis qdate.c
> 
> PS who said anything about placing malware on the end of executables,
> at the compilation step?  Really I'm not paranoid... :)

Use hd to look at the generated binary (after stripping, to get rid of
the debug symbols) and you'll see that there is nothing hidden. You can
also disassemble the file (with source lines as comments) to check the
validity of the generated code. Note that with -O2 the instructions are
rearranged considerably relative to the source lines.

If you want to disassemble the program (compiled with "-g"):

# llvm-objdump90 --source -g a.out

(Use a different objdump version if you only have an older clang on
your system. The llvm objdump works equally well on binaries generated
by GCC and CLANG.)

My tests were performed on a CURRENT/amd64 system.

Regards, STefan


