Date: Sun, 29 Jul 2012 21:20:51 +0800
From: Luba Tang <lubatang@gmail.com>
To: David Chisnall <theraven@freebsd.org>
Cc: Pedro Giffuni <pfg@freebsd.org>, "freebsd-toolchain@freebsd.org" <freebsd-toolchain@freebsd.org>
Subject: Re: BSD ld (was Re: MCLinker and llvm-config)
Message-ID: <CAMW0cx=D-xJ8LsmTpc3%2BnJdpeYzT%2BY0Yu80mf9gE8ncRK55fDQ@mail.gmail.com>
In-Reply-To: <F3939D80-0026-4315-B1B7-D17065AA5022@freebsd.org>
References: <1343484950.37325.YahooMailNeo@web113506.mail.gq1.yahoo.com> <F3939D80-0026-4315-B1B7-D17065AA5022@freebsd.org>
> The linker's ELF generation support is similarly overlapping with that of
> the compiler, and I would much rather that we have a single implementation
> in the base system than two.

There are some things I can share. A year ago, when we were brainstorming the idea of MCLinker, we discussed a similar idea: "Can we leverage the ELF reading/generation part of LLVM?" Unfortunately, during the design stage we found that the answer is no. The reasons are:

1. For performance reasons, linkers do not read whole input files. They read pieces of the input files on demand. This is true not only of MCLinker but also of the Google gold linker: we do not read an entire object file at once. We read a piece of the object file, interpret it, and then drop it immediately. When we need the same piece again, we read it again.

   This is because the number of relocations and symbols is usually on the order of 1 to 100 million. That is huge, but a linker uses only a few of them at any one step. To keep the memory footprint small, linkers do not keep them in memory; they read them from the input files on demand.

   Take Google gold for example: it reads relocations on demand. Whenever it needs a relocation, it reads the input file again and interprets the relocation entries. This approach is very efficient. Because linkers usually mmap input files page by page, the file I/O is small. Google gold saves a huge amount of memory at the small cost of re-interpreting time.

   The other tools (objdump, nm, and so on), however, do not need to handle problems of this scale. They usually just keep everything in memory. That is one reason why linkers should have their own specialized readers rather than reuse the readers of the other tools.

2. Like compilers, linkers also have an intermediate representation (IR). Every linker needs a customized reader to build its IR. MCLinker is based on a fragment-based model, GNU ld on the BFD model, ld64 on an atom-based model, and so on.

   Each model has its own strengths and usually favors a certain file format. MCLinker and Google gold favor ELF, ld64 favors MachO, and GNU ld favors COFF+ (I guess; I may not be right). All linker IRs are a kind of directed acyclic graph (DAG). The main difference between these IRs is the direction of the edges. The readers of the other tools have no idea about the linker IR, which leaves only a slim chance of reusing them.

Best regards,
Luba
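As an illustration of the on-demand reading described in point 1, here is a minimal sketch in Python. It is not how MCLinker or gold is implemented. The class name OnDemandRelocReader and the helper write_demo_table are hypothetical, and the input is a synthetic table of Elf64_Rela-style records (r_offset, r_info, r_addend) rather than a real object file. The point is only that each entry is decoded from an mmap'ed file when it is requested and nothing is cached afterwards.

```python
import mmap
import os
import struct
import tempfile

# Elf64_Rela layout: r_offset (u64), r_info (u64), r_addend (i64), little-endian.
RELA = struct.Struct("<QQq")


class OnDemandRelocReader:
    """Hypothetical reader: decodes one relocation per request, caches none."""

    def __init__(self, path, table_offset, count):
        self._file = open(path, "rb")
        # Map the file read-only; the OS pages data in only when it is touched.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self._table_offset = table_offset
        self.count = count

    def read(self, index):
        """Re-interpret entry `index` from the mapping each time it is requested."""
        if not 0 <= index < self.count:
            raise IndexError(index)
        r_offset, r_info, r_addend = RELA.unpack_from(
            self._map, self._table_offset + index * RELA.size
        )
        # ELF64 packs the symbol index and the relocation type into r_info.
        return {
            "offset": r_offset,
            "symbol": r_info >> 32,
            "type": r_info & 0xFFFFFFFF,
            "addend": r_addend,
        }

    def close(self):
        self._map.close()
        self._file.close()


def write_demo_table(entries):
    """Write a synthetic relocation table (not a full ELF file) to a temp file."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for offset, symbol, rtype, addend in entries:
            f.write(RELA.pack(offset, (symbol << 32) | rtype, addend))
    return path


if __name__ == "__main__":
    path = write_demo_table([(0x10, 1, 2, -4), (0x20, 3, 1, 0)])
    reader = OnDemandRelocReader(path, table_offset=0, count=2)
    print(reader.read(1))
    reader.close()
    os.remove(path)
```

The trade-off is the one described above: every call to read() pays the decoding cost again, but memory use stays roughly constant no matter how many entries the table holds. A tool like nm or objdump would more naturally decode the whole table once and keep it as a list.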