Date:      Sun, 29 Jul 2012 21:20:51 +0800
From:      Luba Tang <lubatang@gmail.com>
To:        David Chisnall <theraven@freebsd.org>
Cc:        Pedro Giffuni <pfg@freebsd.org>, "freebsd-toolchain@freebsd.org" <freebsd-toolchain@freebsd.org>
Subject:   Re: BSD ld (was Re: MCLinker and llvm-config)
Message-ID:  <CAMW0cx=D-xJ8LsmTpc3%2BnJdpeYzT%2BY0Yu80mf9gE8ncRK55fDQ@mail.gmail.com>
In-Reply-To: <F3939D80-0026-4315-B1B7-D17065AA5022@freebsd.org>
References:  <1343484950.37325.YahooMailNeo@web113506.mail.gq1.yahoo.com> <F3939D80-0026-4315-B1B7-D17065AA5022@freebsd.org>

> The linker's ELF generation support is similarly overlapping with that of
> the compiler, and I would much rather that we have a single implementation
> in the base system than two.
>

There are a few things I can share.
One year ago, when we were brainstorming the idea of MCLinker, we also
discussed a similar idea: "Can we leverage the ELF reading/generation part
of LLVM?". Unfortunately, we found during the design stage that the answer
was no. The reasons are:

1. For performance reasons, linkers do not read whole input files.
Linkers read pieces of the input files on demand.

Like Google's gold linker, MCLinker does not read whole object files at
once. We read a piece of an object file, interpret it, and then drop it
immediately. When we need the same piece again, we read it again.

This is because the number of relocations and symbols is usually on the
order of 1~100 million. That is huge, but a linker uses only a few of them
at any one step. To keep the memory footprint small, linkers do not keep
them in memory; they read them from the input files on demand.
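
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes 64-bit ELF, where one Elf64_Rela entry is 24 bytes; the counts are
just the two ends of the range quoted above, not measurements, and
resident_bytes is a name made up for this example.

```python
# Rough footprint if every relocation entry stayed resident in memory.
# Assumption: 64-bit ELF, where an Elf64_Rela entry is 24 bytes
# (r_offset, r_info, r_addend: 8 bytes each).
import struct

ELF64_RELA = struct.Struct("<QQq")  # little-endian layout, no padding


def resident_bytes(entry_count: int, entry_size: int = ELF64_RELA.size) -> int:
    """Bytes needed to hold entry_count raw entries at once."""
    return entry_count * entry_size


for count in (1_000_000, 100_000_000):
    mib = resident_bytes(count) / (1024 * 1024)
    print(f"{count:>11,} relocations -> {mib:,.0f} MiB")
```

This counts only the raw on-disk entries; a parsed in-memory form would
typically be larger.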

Take Google gold for example: it reads relocations on demand. Whenever
it needs a relocation, it reads the input file again and interprets the
relocation entries. This approach is very efficient: because linkers
usually mmap their input files in pages, the file I/O is small. Google gold
saves a huge amount of memory at the small cost of re-interpretation time.
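
The on-demand pattern can be sketched roughly as follows. This is not
gold's or MCLinker's code: it assumes a flat file of little-endian
Elf64_Rela records standing in for a real .rela section, and RelaReader
and its methods are hypothetical names.

```python
# On-demand decoding of relocation entries through mmap.
# Assumptions: a flat file of little-endian Elf64_Rela records stands in for
# a real .rela section; RelaReader and its methods are hypothetical names.
import mmap
import os
import struct
import tempfile

RELA = struct.Struct("<QQq")  # r_offset, r_info, r_addend


class RelaReader:
    """Decodes one entry at a time; keeps no decoded entries around."""

    def __init__(self, path: str) -> None:
        self._file = open(path, "rb")
        # The OS pages data in lazily, so untouched entries cost no I/O.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self) -> int:
        return len(self._map) // RELA.size

    def entry(self, index: int) -> tuple:
        """Re-interpret entry `index` from the mapping on every call."""
        r_offset, r_info, r_addend = RELA.unpack_from(self._map, index * RELA.size)
        # ELF64 packs the symbol index and the relocation type into r_info.
        return r_offset, r_info >> 32, r_info & 0xFFFFFFFF, r_addend

    def close(self) -> None:
        self._map.close()
        self._file.close()


# Build a small synthetic input so the sketch runs on its own.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(1000):
        tmp.write(RELA.pack(0x1000 + 8 * i, (i << 32) | 1, -4))
    sample_path = tmp.name

reader = RelaReader(sample_path)
first_read = reader.entry(42)
second_read = reader.entry(42)  # decoded again, nothing was cached
entry_count = len(reader)
reader.close()
os.unlink(sample_path)
```

Each call to entry() pays the decoding cost again, but the process never
holds more than one decoded entry, which is the trade described above.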

However, other tools (objdump, nm, and so on) do not need to handle a
problem of such scale. They usually just keep everything in memory.
That is one reason why linkers should have their own specialized readers
rather than reuse the readers of other tools.

2. Like compilers, linkers also have an intermediate representation (IR).
Every linker needs a customized reader to build its IR.

MCLinker is based on a fragment-based model, GNU ld on the BFD model, ld64
on an atom-based model, and so on.
Each model has its own unique strengths and usually favors a certain file
format. MCLinker and Google gold favor ELF, ld64 favors Mach-O, and GNU ld
favors COFF+ (I guess; I may not be right). All linker IRs are a kind of
directed acyclic graph (DAG). The main difference between these IRs is the
direction of the edges. The readers of other tools know nothing about
linker IR, which leaves only a slim chance of reusing them.
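
To illustrate what "direction of the edges" means, here is a toy sketch.
It does not reproduce the IR of any linker named above; Reference,
out_edges and in_edges are invented for illustration. It stores the same
three references twice, once with edges pointing from the referrer to its
target and once reversed.

```python
# Toy model: the same references stored with edges pointing either way.
# Hypothetical names throughout; no real linker's IR is reproduced here.
from collections import defaultdict
from typing import NamedTuple


class Reference(NamedTuple):
    source: str  # fragment that contains the relocation
    target: str  # fragment or symbol the relocation points at


refs = [
    Reference("main.text", "printf"),
    Reference("main.text", "helper.text"),
    Reference("helper.text", "printf"),
]


def out_edges(references):
    """Referrer -> targets: answers 'what does this fragment need?'"""
    graph = defaultdict(list)
    for ref in references:
        graph[ref.source].append(ref.target)
    return dict(graph)


def in_edges(references):
    """Target -> referrers: answers 'who uses this definition?'"""
    graph = defaultdict(list)
    for ref in references:
        graph[ref.target].append(ref.source)
    return dict(graph)


forward = out_edges(refs)
backward = in_edges(refs)
```

A reader that builds one of these graphs does not hand the linker the
other one for free; it has to be rebuilt from the same input.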

Best regards,
Luba

